Using the package

Q: How do I run data checks in the package?

A: Please see the Getting started page of the package.


Q: Can the data check functions be used on SDTM studies with SUPPQUAL domains?

A: The package is typically used on transformed SDTM data that has SUPPQUAL variables appended to the parent domain. Most, but not all, check functions should still work if the input data keeps supplemental qualifiers in separate SUPPQUAL domains.
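As a rough illustration of what "appended to the parent domain" means, the base R sketch below transposes a toy SUPPAE dataset onto a toy AE dataset. The helper name append_supp and all data values are hypothetical and not part of the package, and the sketch assumes every supplemental record uses the same single IDVAR.

```r
# Toy AE parent domain and its supplemental qualifier domain (made-up values)
ae <- data.frame(
  USUBJID = c("S1", "S1", "S2"),
  AESEQ   = c(1, 2, 1),
  AETERM  = c("HEADACHE", "NAUSEA", "RASH"),
  stringsAsFactors = FALSE
)
suppae <- data.frame(
  USUBJID  = c("S1", "S2"),
  RDOMAIN  = "AE",
  IDVAR    = "AESEQ",
  IDVARVAL = c("2", "1"),
  QNAM     = "AETRTEM",
  QVAL     = c("Y", "N"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: add one column per QNAM to the parent domain.
# Assumption: all SUPP records share one IDVAR that exists in the parent.
append_supp <- function(parent, supp) {
  idvar <- unique(supp$IDVAR)
  stopifnot(length(idvar) == 1, idvar %in% names(parent))
  for (qnam in unique(supp$QNAM)) {
    s <- supp[supp$QNAM == qnam, ]
    # Match on subject plus the IDVAR value, both compared as character
    parent_key <- paste(parent$USUBJID, as.character(parent[[idvar]]))
    supp_key   <- paste(s$USUBJID, s$IDVARVAL)
    parent[[qnam]] <- s$QVAL[match(parent_key, supp_key)]
  }
  parent
}

ae_full <- append_supp(ae, suppae)
```

Parent records without a matching supplemental record get NA in the new column. Real studies may use several IDVAR values or none at all, which this sketch does not handle.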


Q: How are the High, Medium, and Low priority designations assigned?

A: The sdtmchecksmeta.RData dataset includes a priority designation for each check. High priority is typically assigned to data checks related to death, disposition, and key safety concepts. Please note that this assignment is subjective: the package provides general checks and may not cover domains that are critical for a specific study. The priority level can be searched via the “Priority” column on the Search Data Check Functions page.


Q: Why doesn’t the package install from CRAN properly?

A: The package is not available on CRAN. Please see the installation syntax here.


Data checks

Q: What data checks are included in the package?

A: The data checks are listed in the Reference section, with function names beginning with check_.... These check functions can be searched via the Search Data Check Functions page.


Relationship to other tools and data standards

Q: What is the difference between the data checks of Pinnacle 21 (P21) and those of the sdtmchecks package?

A: P21 validation checks are for CDISC conformance and are quite comprehensive for eSubmission, whereas the sdtmchecks package aims to cover variables impacting analysis. There may be some overlap in checks from these two tools - for example, AEs with missing preferred terms are flagged with both tools. CDISC codelist versions are not used as a reference for the sdtmchecks package. Checks of controlled terminology and inclusion of all codelist items are covered in P21; data checks of this nature are not included in sdtmchecks.


Q: Can the data checks in this package be run in lieu of P21 validation checks?

A: No. The checks in this package should be considered complementary to P21 validation, and they leave coverage gaps.


Q: What are the coverage gaps in the data checks in the package?

A: In general, non-standard data are not covered by these checks. In oncology Response (RS) domain data, the emphasis is on investigator-based results rather than independent review facility data reported by a vendor, because the former are typically more standardized.


Q: Does SDTM Implementation Guide (IG) version information need to be entered in order to use the package?

A: No, the data checks do not require IG version information as initial input.


Q: Does the SDTM Controlled Terminology (CT) version need to be entered in order to use the package?

A: No, the data checks do not require CT version information as initial input. The search for strings is relatively broad and intended to cover older and newer CT versions.

For example, RSTESTCD with “OVRLRESP” is expected as the value for Overall Response in RS for both solid tumor and hematology studies in oncology, and this string value is consistent across specific standards versions. If your study assigns a different, non-standard value to indicate Overall Response, then checking records through study-specific code will ensure important analysis-impacting data are not missed.
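A study-specific check of this kind could look like the base R sketch below. The helper name missing_overall_response, the non-standard test code "XRESP", and the toy data are all hypothetical; the sketch only illustrates passing extra RSTESTCD values alongside “OVRLRESP” and is not taken from the package.

```r
# Toy RS data (made-up values); "XRESP" stands in for a study-specific code
rs <- data.frame(
  USUBJID  = c("S1", "S2", "S3"),
  RSTESTCD = c("OVRLRESP", "OVRLRESP", "XRESP"),
  RSORRES  = c("PR", "", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: records for the given test codes with a missing result
missing_overall_response <- function(rs, testcds = "OVRLRESP") {
  is_target  <- rs$RSTESTCD %in% testcds
  is_missing <- is.na(rs$RSORRES) | trimws(rs$RSORRES) == ""
  rs[is_target & is_missing, , drop = FALSE]
}

# Standard code only: the S3 record is not examined
standard_only <- missing_overall_response(rs)

# Standard plus study-specific code: the S3 record is also flagged
with_study_code <- missing_overall_response(rs, testcds = c("OVRLRESP", "XRESP"))
```

Comparing the two results shows how a record under a non-standard code can be missed when only the standard string is searched.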


Q: Which SDTM Implementation Guide (IG) version is used as the basis of the checks?

A: The existing data checks are intended to run regardless of SDTM IG and CT versions. The package emphasis is on general data issues that impact analysis. Most checks run regardless of the study build date and SDTM version.

Some workarounds have been incorporated in data checks that require variables with known differences across versions.

For example, some data check functions use EXOCCUR, a variable in versions prior to SDTM IG 3.2/SDTM 1.4. If your study does not have an EXOCCUR variable in the EX domain, then running the function check_ex_exdose_exoccur, which checks for EXDOSE>0 when EXOCCUR is not ‘Y’, will return “EX is missing the variable: EXOCCUR”.
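The behavior described here can be pictured with the base R sketch below. It is not the package's implementation of check_ex_exdose_exoccur: the function name sketch_exdose_exoccur, the plain character return value, and the toy data are assumptions made for illustration, and the actual function may structure its result differently.

```r
# Hypothetical sketch of "report a missing variable before checking the data"
sketch_exdose_exoccur <- function(ex) {
  required     <- c("USUBJID", "EXDOSE", "EXOCCUR")
  missing_vars <- setdiff(required, names(ex))
  if (length(missing_vars) > 0) {
    return(paste0("EX is missing the variable: ",
                  paste(missing_vars, collapse = ", ")))
  }
  # Flag records with a positive dose where EXOCCUR is missing or not "Y"
  has_dose  <- !is.na(ex$EXDOSE) & ex$EXDOSE > 0
  not_occur <- is.na(ex$EXOCCUR) | ex$EXOCCUR != "Y"
  ex[has_dose & not_occur, , drop = FALSE]
}

# Toy EX data without EXOCCUR (made-up values)
ex_without <- data.frame(USUBJID = c("S1", "S2"), EXDOSE = c(10, 0))

# Toy EX data with EXOCCUR (made-up values)
ex_with <- data.frame(
  USUBJID = c("S1", "S2", "S3"),
  EXDOSE  = c(10, 10, 0),
  EXOCCUR = c("Y", "N", "N"),
  stringsAsFactors = FALSE
)

msg     <- sketch_exdose_exoccur(ex_without)
flagged <- sketch_exdose_exoccur(ex_with)
```

With these toy inputs, the first call yields the missing-variable message and the second yields the single record with a positive dose and EXOCCUR of "N".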

Later SDTM versions use Link ID variables --LNKID rather than --LINKID. A check function, such as check_tr_trstresn_ldiam, allows either TRLNKID or TRLINKID and does not require the user to enter an SDTM version.
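One way to accept either spelling without asking for an SDTM version is to look for whichever column is present. The base R sketch below shows the idea; the helper name find_link_var and the toy data are hypothetical, and this is not necessarily how check_tr_trstresn_ldiam is written.

```r
# Hypothetical helper: name of the first link ID column found in TR
find_link_var <- function(tr, candidates = c("TRLNKID", "TRLINKID")) {
  found <- intersect(candidates, names(tr))
  if (length(found) == 0) {
    stop("TR is missing both variables: ", paste(candidates, collapse = ", "))
  }
  found[1]
}

# Toy TR data using each spelling (made-up values)
tr_newer <- data.frame(USUBJID = "S1", TRLNKID  = "T01", TRSTRESN = 12)
tr_older <- data.frame(USUBJID = "S1", TRLINKID = "T01", TRSTRESN = 12)

link_newer <- find_link_var(tr_newer)
link_older <- find_link_var(tr_older)

# The rest of a check can then refer to the column by its detected name
link_values <- tr_older[[link_older]]
```

Because the column name is detected from the data, the same code path serves studies built under either naming convention.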

Please feel free to share recommendations on making the checks more robust and version-agnostic by logging an issue.


Package maintenance and release schedule

Q: What is the package update release schedule?

A: The release schedule is TBD.


Contributing

Q: How can I contribute to the package?

A: Please feel free to share issues and suggestions on the Issues page of the sdtmchecks GitHub repo. Refer to the Writing a New Check guidance to contribute to the codebase.


Q: Can I contribute study-specific checks to the package?

A: Data checks that are overly specific and not generalizable are unlikely to be added to the package, but information may be included in the documentation to help users incorporate ad hoc checks.