library(pharmaverseraw)
library(sdtm.oak)
library(dplyr)

# Read in raw data
ae_raw <- pharmaverseraw::ae_raw

# Derive oak_id_vars
ae_raw <- ae_raw %>%
  generate_oak_id_vars(
    pat_var = "PATNUM",
    raw_src = "ae_raw"
  )

# Map AETERM and AESDTH variables for AE domain
ae <-
  # Derive topic variable
  # Map AETERM using assign_no_ct, raw_var=IT.AETERM, tgt_var=AETERM
  assign_no_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AETERM",
    tgt_var = "AETERM",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESDTH using hardcode_no_ct and condition_add, raw_var=IT.AESDTH, tgt_var=AESDTH
  # If Yes then AESDTH = Y else Not submitted
  hardcode_no_ct(
    raw_dat = condition_add(ae_raw, IT.AESDTH == "Yes"),
    raw_var = "IT.AESDTH",
    tgt_var = "AESDTH",
    tgt_val = "Y",
    id_vars = oak_id_vars()
  ) %>%
  hardcode_no_ct(
    raw_dat = condition_add(ae_raw, IT.AESDTH != "Yes"),
    raw_var = "IT.AESDTH",
    tgt_var = "AESDTH",
    tgt_val = "Not Submitted",
    id_vars = oak_id_vars()
  )
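To make the conditional mapping concrete, here is a minimal base R sketch of the rule that the two hardcode_no_ct() calls above express: a raw IT.AESDTH value of "Yes" becomes AESDTH = "Y", and any other non-missing value becomes "Not Submitted". The toy data frame and the helper name map_aesdth are hypothetical and exist only for illustration. This is not how {sdtm.oak} implements the mapping internally, and its handling of missing raw values may differ from the choice made here (missing stays missing).

```r
# Toy raw data; the values are made up for illustration only.
toy_raw <- data.frame(
  PATNUM = c("001", "002", "003"),
  IT.AESDTH = c("Yes", "No", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of {sdtm.oak}) mirroring the two conditions.
map_aesdth <- function(raw_value) {
  out <- rep(NA_character_, length(raw_value))
  # Condition 1: raw value is "Yes" -> hardcode "Y"
  out[!is.na(raw_value) & raw_value == "Yes"] <- "Y"
  # Condition 2: raw value is present but not "Yes" -> hardcode "Not Submitted"
  out[!is.na(raw_value) & raw_value != "Yes"] <- "Not Submitted"
  out
}

toy_raw$AESDTH <- map_aesdth(toy_raw$IT.AESDTH)
print(toy_raw)
```

Writing the rule out this way can be a handy cross-check when reviewing the output of the real mapping on a handful of records.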
Background
The
Why now?
With the recent release of the
What is in {pharmaverseraw}?
The {pharmaverseraw} package includes raw datasets for the following domains:
- AE: Adverse Events
- DS: Subject Disposition
- DM: Demographics
- EC/EX: Exposure
These raw datasets in {pharmaverseraw} are:
- EDC agnostic: They are not tied to any specific Electronic Data Capture (EDC) system like Rave or Veeva.
- Standards agnostic: Some variables follow CDASH (Clinical Data Acquisition Standards Harmonization), while others do not. This reflects real-world data standards variability across companies.
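One way to see which raw datasets are available, without assuming their names, is to ask R for the datasets a package ships. The sketch below uses only base R; the helper name list_package_datasets is hypothetical, and the call returns an empty vector if the package is not installed.

```r
# Hypothetical helper: list the datasets shipped with an installed package.
list_package_datasets <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Package not installed: nothing to list
    return(character(0))
  }
  # data(package = ...) returns an object whose `results` matrix
  # has one row per dataset and an "Item" column holding the names.
  unname(utils::data(package = pkg)$results[, "Item"])
}

list_package_datasets("pharmaverseraw")
```

Comparing names(ae_raw) against the CDASH variable names you expect is then a quick way to spot which columns follow the standard and which do not.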
The annotated case report forms (aCRFs) corresponding to the raw datasets are also present in the inst\acrf folder. These PDF files illustrate how each raw variable aligns with SDTM expectations, offering insight into the mapping logic used in
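If the package is installed, the aCRF files can be located from R. The sketch below assumes the standard R packaging behaviour that the contents of inst are copied to the top level of the installed package, so the folder is looked up as "acrf"; the helper name list_acrf_files is hypothetical, and the files may only be available in the source repository if they are excluded from the build.

```r
# Hypothetical helper: find annotated CRF PDFs inside an installed package.
list_acrf_files <- function(pkg = "pharmaverseraw", subdir = "acrf") {
  # system.file() returns "" when the package or directory is not found
  acrf_dir <- system.file(subdir, package = pkg)
  if (!nzchar(acrf_dir)) {
    return(character(0))
  }
  list.files(acrf_dir, pattern = "\\.pdf$", ignore.case = TRUE, full.names = TRUE)
}

list_acrf_files()
```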
How are these datasets created?
The datasets in
From raw to SDTM
Using
The example snippet at the top of this post shows how to use a raw AE dataset from {pharmaverseraw} with {sdtm.oak} to map the AETERM and AESDTH variables.
Get involved
Similar to the other tools under the pharmaverse umbrella, the
If you’d like to get involved, here are some ways you can help:
- Add new raw datasets: Take other SDTM domains from the
- Create mock aCRFs for the raw datasets: Develop annotated case report forms (aCRFs) that illustrate how the raw variables are mapped to align with SDTM standards.
- Prepare documentation: For each raw dataset you create, include documentation that explains the data structure, variable definitions, and any relevant notes.
Whether you’re a programmer, CDISC expert, or clinical data manager, your contributions can make a meaningful impact. Visit the GitHub repository to open an issue or start a discussion.
Last updated: 2025-07-07 17:06:21.537118
Citation
@online{chen2025,
author = {Chen, Shiyu},
title = {Raw Data for Domains in the Pharmaversesdtm Package},
date = {2025-07-07},
url = {https://pharmaverse.github.io/blog/posts/2025-07-07_pharmaverse.../pharmaverseraw__package.html},
langid = {en}
}