pharmaverse examples

AE

Introduction

This article describes how to create an events SDTM domain using the {sdtm.oak} package. Examples are currently presented and tested in the context of the AE domain.

Before reading this article, users are encouraged to review two articles in the {sdtm.oak} package documentation that introduce the key concepts: Algorithms & Sub-Algorithms (https://pharmaverse.github.io/sdtm.oak/articles/algorithms.html) and Creating an Interventions Domain (https://pharmaverse.github.io/sdtm.oak/articles/interventions_domain.html). The latter explains concepts such as oak_id_vars and condition_add in detail, offers guidance on which mapping algorithms or functions to use for different mappings, and describes how they work.

In this article, we will dive directly into programming and provide further explanation only where it is required.

Programming workflow

  • Read in data
  • Create oak_id_vars
  • Read in CT
  • Map Topic Variable
  • Map Rest of the Variables
  • Repeat Map Topic and Map Rest

Repeat the steps above for each raw dataset before proceeding with the steps below.

  • Create SDTM derived variables
  • Add Labels and Attributes

Read in data

Read all the raw datasets into the environment. In this example, there is a single raw dataset, ae_raw, which can be read from the {pharmaverseraw} package using the code below:

library(sdtm.oak)
library(pharmaverseraw)
library(dplyr)

ae_raw <- pharmaverseraw::ae_raw

Read in the DM domain from the {pharmaversesdtm} package. It is used later to derive the study day variables:

dm <- pharmaversesdtm::dm

SDTM aCRF

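The mock-up of the Adverse Events aCRF can be viewed at https://github.com/pharmaverse/pharmaverseraw/blob/main/vignettes/articles/aCRFs/AdverseEvent_aCRF.pdf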

Create oak_id_vars

ae_raw <- ae_raw %>%
  generate_oak_id_vars(
    pat_var = "PATNUM",
    raw_src = "ae_raw"
  )
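
As an illustration of what this step adds, the base R sketch below mimics its effect on a toy data frame. It assumes the three ID variables are named oak_id, raw_source and patient_number, as returned by oak_id_vars(). The toy values and the helper add_id_vars_sketch are hypothetical, and this is not the package's implementation.

```r
# Toy raw dataset (hypothetical values, not taken from pharmaverseraw)
toy_raw <- data.frame(
  PATNUM = c("701-1015", "701-1015", "701-1023"),
  IT.AETERM = c("Headache", "Nausea", "Rash"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mimicking the effect of generate_oak_id_vars()
add_id_vars_sketch <- function(raw_dat, pat_var, raw_src) {
  ids <- data.frame(
    oak_id = seq_len(nrow(raw_dat)),      # one ID per raw record
    raw_source = raw_src,                 # name of the raw dataset
    patient_number = raw_dat[[pat_var]],  # subject identifier as collected
    stringsAsFactors = FALSE
  )
  cbind(ids, raw_dat)
}

toy_with_ids <- add_id_vars_sketch(toy_raw, pat_var = "PATNUM", raw_src = "toy_raw")
print(toy_with_ids)
```

Together, these three columns identify each raw record, which is why every mapping call in this article passes id_vars = oak_id_vars().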

Read in CT

Controlled Terminology is part of the SDTM specification and is prepared by the user. In this example, the study controlled terminology file is sdtm_ct.csv, which is read from the local metadata folder using the code below:

study_ct <- read.csv("metadata/sdtm_ct.csv")
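
To make the role of the controlled terminology concrete, the sketch below recodes collected values against a tiny lookup table in base R. The column names codelist_code, collected_value and term_value are assumed to match the layout of the study CT file, while the rows and the helper recode_ct_sketch are hypothetical. The mapping functions in {sdtm.oak} handle more cases, so this only illustrates the idea.

```r
# Toy controlled terminology (hypothetical rows; column names are assumed)
toy_ct <- data.frame(
  codelist_code = c("C66742", "C66742", "C66769", "C66769", "C66769"),
  collected_value = c("Yes", "No", "Mild", "Moderate", "Severe"),
  term_value = c("Y", "N", "MILD", "MODERATE", "SEVERE"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: recode collected values using one codelist
recode_ct_sketch <- function(x, ct, codelist) {
  ct <- ct[ct$codelist_code == codelist, ]
  idx <- match(x, ct$collected_value)
  # In this sketch, values without a matching term are kept as collected
  ifelse(is.na(idx), x, ct$term_value[idx])
}

yes_no <- recode_ct_sketch(c("Yes", "No", NA), toy_ct, "C66742")
severity <- recode_ct_sketch(c("Mild", "Unknown"), toy_ct, "C66769")
print(yes_no)
print(severity)
```

Restricting the lookup to a single codelist matters because the same collected text can map to different terms in different codelists.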

Map Topic Variable

The topic variable is the primary variable of the SDTM domain and is mapped first; the remaining variables add further definition to it. In this example, the topic variable is AETERM, and it is mapped from the raw dataset column IT.AETERM. The mapping logic is: "Map the collected value in the ae_raw dataset IT.AETERM variable to AE.AETERM."

This mapping does not involve any controlled terminology. The sdtm.oak::assign_no_ct() function is used for mapping. Once the topic variable is mapped, the Qualifier, Identifier, and Timing variables can be mapped.

ae <-
  # Derive topic variable
  # Map AETERM using assign_no_ct, raw_var=IT.AETERM, tgt_var=AETERM
  assign_no_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AETERM",
    tgt_var = "AETERM",
    id_vars = oak_id_vars()
  )

Map Rest of the Variables

The Qualifiers, Identifiers, and Timing Variables can be mapped in any order.

ae <- ae %>%
  # Map AEOUT using assign_ct, raw_var=AEOUTCOME, tgt_var=AEOUT
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "AEOUTCOME",
    tgt_var = "AEOUT",
    ct_spec = study_ct,
    ct_clst = "C66768",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESEV using assign_ct, raw_var=IT.AESEV, tgt_var=AESEV
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AESEV",
    tgt_var = "AESEV",
    ct_spec = study_ct,
    ct_clst = "C66769",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESER using assign_ct, raw_var=IT.AESER, tgt_var=AESER
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AESER",
    tgt_var = "AESER",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AEACN using assign_no_ct, raw_var=IT.AEACN, tgt_var=AEACN
  assign_no_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AEACN",
    tgt_var = "AEACN",
    id_vars = oak_id_vars()
  ) %>%
  # Map AEREL using assign_ct, raw_var=IT.AEREL, tgt_var=AEREL
  # AEREL is a user-added codelist in the study CT
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AEREL",
    tgt_var = "AEREL",
    ct_spec = study_ct,
    ct_clst = "AEREL",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESCAN using assign_ct, raw_var=AESCAN, tgt_var=AESCAN
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "AESCAN",
    tgt_var = "AESCAN",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESCONG using assign_ct, raw_var=AESCNO, tgt_var=AESCONG
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "AESCNO",
    tgt_var = "AESCONG",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESDISAB using assign_ct, raw_var=AEDIS, tgt_var=AESDISAB
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "AEDIS",
    tgt_var = "AESDISAB",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESDTH using assign_ct, raw_var=IT.AESDTH, tgt_var=AESDTH
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AESDTH",
    tgt_var = "AESDTH",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESHOSP using assign_ct, raw_var=IT.AESHOSP, tgt_var=AESHOSP
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AESHOSP",
    tgt_var = "AESHOSP",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESLIFE using assign_ct, raw_var=IT.AESLIFE, tgt_var=AESLIFE
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AESLIFE",
    tgt_var = "AESLIFE",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESOD using assign_ct, raw_var=AESOD, tgt_var=AESOD
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "AESOD",
    tgt_var = "AESOD",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AEDTC using assign_datetime, raw_var=AEDTCOL
  assign_datetime(
    raw_dat = ae_raw,
    raw_var = "AEDTCOL",
    tgt_var = "AEDTC",
    raw_fmt = c("m/d/y"),
    id_vars = oak_id_vars()
  ) %>%
  # Map AESTDTC using assign_datetime, raw_var=IT.AESTDAT
  assign_datetime(
    raw_dat = ae_raw,
    raw_var = "IT.AESTDAT",
    tgt_var = "AESTDTC",
    raw_fmt = c("m/d/y"),
    id_vars = oak_id_vars()
  ) %>%
  # Map AEENDTC using assign_datetime, raw_var=IT.AEENDAT
  assign_datetime(
    raw_dat = ae_raw,
    raw_var = "IT.AEENDAT",
    tgt_var = "AEENDTC",
    raw_fmt = c("m/d/y"),
    id_vars = oak_id_vars()
  )
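
The date mappings above turn collected dates into the ISO 8601 text format used by SDTM --DTC variables. As a rough base R illustration for complete dates only, the sketch below assumes that the raw values follow a month/day/year layout with a four-digit year, which is an assumption about the raw data. The toy values and the helper to_iso8601_sketch are hypothetical, and partial or unknown date components are not covered.

```r
# Hypothetical helper: convert complete "mm/dd/yyyy" dates to ISO 8601 text
to_iso8601_sketch <- function(x) {
  format(as.Date(x, format = "%m/%d/%Y"), "%Y-%m-%d")
}

# Toy collected dates (hypothetical values)
collected <- c("03/25/2014", "11/02/2013")
iso_dates <- to_iso8601_sketch(collected)
print(iso_dates)
```

Values that cannot be parsed come back as NA in this sketch, so it should not be used where incomplete dates are expected.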

Repeat Map Topic and Map Rest

This step is required only if there is more than one topic variable to map. This raw data source has a single topic variable and no additional topic variable mappings, so users can proceed to the next step.

Create SDTM derived variables

The SDTM derived variables, and any SDTM mapping that applies to all the records in the ae dataset produced in the previous step, can be created now.

ae <- ae %>%
  dplyr::mutate(
    STUDYID = ae_raw$STUDY,
    DOMAIN = "AE",
    USUBJID = paste0("01-", ae_raw$PATNUM),
    AELLT = ae_raw$AELLT,
    AELLTCD = ae_raw$AELLTCD,
    AEDECOD = ae_raw$AEDECOD,
    AEPTCD = ae_raw$AEPTCD,
    AEHLT = ae_raw$AEHLT,
    AEHLTCD = ae_raw$AEHLTCD,
    AEHLGT = ae_raw$AEHLGT,
    AEHLGTCD = ae_raw$AEHLGTCD,
    AEBODSYS = ae_raw$AEBODSYS,
    AEBDSYCD = ae_raw$AEBDSYCD,
    AESOC = ae_raw$AESOC,
    AESOCCD = ae_raw$AESOCCD,
    AETERM = toupper(AETERM)
  ) %>%
  derive_seq(
    tgt_var = "AESEQ",
    rec_vars = c("USUBJID", "AETERM")
  ) %>%
  derive_study_day(
    sdtm_in = .,
    dm_domain = dm,
    tgdt = "AESTDTC",
    refdt = "RFXSTDTC",
    study_day_var = "AESTDY"
  ) %>%
  derive_study_day(
    sdtm_in = .,
    dm_domain = dm,
    tgdt = "AEENDTC",
    # Study days are counted from the reference start date, as for AESTDY
    refdt = "RFXSTDTC",
    study_day_var = "AEENDY"
  ) %>%
  select(
    "STUDYID", "DOMAIN", "USUBJID", "AESEQ", "AETERM", "AELLT", "AELLTCD", "AEDECOD", "AEPTCD", "AEHLT", "AEHLTCD", "AEHLGT",
    "AEHLGTCD", "AEBODSYS", "AEBDSYCD", "AESOC", "AESOCCD", "AESEV", "AESER", "AEACN", "AEREL", "AEOUT", "AESCAN", "AESCONG",
    "AESDISAB", "AESDTH", "AESHOSP", "AESLIFE", "AESOD", "AEDTC", "AESTDTC", "AEENDTC", "AESTDY", "AEENDY"
  )
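
The sequence number and study day derivations follow simple rules that can be sketched in base R. The sketch assumes the usual SDTM convention that there is no study day 0: dates on or after the reference date get the difference in days plus one, and earlier dates get the plain negative difference. The toy data and the helper study_day_sketch are hypothetical, and the sketch does not reproduce the package's sorting or its handling of partial dates.

```r
# Hypothetical helper: study day relative to a reference date (no day 0)
study_day_sketch <- function(date, ref) {
  d <- as.integer(as.Date(date) - as.Date(ref))
  ifelse(d >= 0L, d + 1L, d)
}

# Toy AE records with the reference date already merged in (hypothetical values)
toy_ae <- data.frame(
  USUBJID = c("01-701-1015", "01-701-1015", "01-701-1023"),
  AESTDTC = c("2014-01-03", "2014-01-01", "2012-08-07"),
  RFXSTDTC = c("2014-01-02", "2014-01-02", "2012-08-05"),
  stringsAsFactors = FALSE
)

toy_ae$AESTDY <- study_day_sketch(toy_ae$AESTDTC, toy_ae$RFXSTDTC)

# Sequence number within each subject, following the current row order
toy_ae$AESEQ <- ave(seq_along(toy_ae$USUBJID), toy_ae$USUBJID, FUN = seq_along)

print(toy_ae)
```

An event that starts the day before the reference date gets study day -1, and one that starts on the reference date gets study day 1.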

Add Labels and Attributes

This step is yet to be developed. Please refer to the {metatools} package to investigate options.
