library(dplyr)
# Create categorical variables, remove screen failures, and assign column labels
adsl <- pharmaverseadam::adsl |>
  filter(!ACTARM %in% "Screen Failure") |>
  mutate(
    SEX = case_match(SEX, "M" ~ "MALE", "F" ~ "FEMALE"),
    AGEGR1 =
      case_when(
        between(AGE, 18, 40) ~ "18-40",
        between(AGE, 41, 64) ~ "41-64",
        AGE > 64 ~ ">=65"
      ) |>
      factor(levels = c("18-40", "41-64", ">=65"))
  ) |>
  labelled::set_variable_labels(
    AGE = "Age (yr)",
    AGEGR1 = "Age group",
    SEX = "Sex",
    RACE = "Race"
  )
Demographic Table
Introduction
This guide shows how pharmaverse packages, along with some from the tidyverse, can be used to create a demographic table, using the {pharmaverseadam} ADSL data as input.
In the examples below, we illustrate two general approaches for creating a demographics table. The first uses Analysis Results Datasets (ARDs), part of the emerging CDISC Analysis Results Standard. The second is the classic method of creating summary tables directly from a data set.
Data preprocessing
Next, we preprocess the data to create a few formatted variables ready for display in the table.
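The age-group derivation in the preprocessing code can be checked on a handful of values. The sketch below applies the same `case_when()` and `between()` logic to a hypothetical toy vector of ages (the `toy` data frame and its values are illustrative assumptions, not part of the {pharmaverseadam} data).

```r
library(dplyr)

# Hypothetical toy ages: one below 18, the boundaries of each group, and a missing value
toy <- tibble(AGE = c(17, 18, 40, 41, 64, 65, NA))

toy <- toy |>
  mutate(
    AGEGR1 = case_when(
      between(AGE, 18, 40) ~ "18-40", # between() includes both endpoints
      between(AGE, 41, 64) ~ "41-64",
      AGE > 64 ~ ">=65"
    ) |>
      factor(levels = c("18-40", "41-64", ">=65"))
  )

# Ages matching no condition (17 and NA here) get a missing AGEGR1
toy
```

Because `AGEGR1` is a factor with all three levels declared, a level with no subjects is still kept, which is why an empty "18-40" group can still appear as a row of zeros in a summary table.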
{gtsummary} & {cards}
In the example below, we will use the {gtsummary} and {cards} packages to create a demographics table.
- The {cards} package creates Analysis Results Datasets (ARDs, which are a part of the CDISC Analysis Results Standard).
- The {gtsummary} package uses ARDs to create tables.
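To make the idea of an ARD concrete, here is a minimal hand-rolled sketch of a long-format results dataset with one row per group and statistic. It uses only {dplyr} and {tidyr} on a hypothetical toy data frame; it is not how {cards} is implemented, and a real {cards} ARD carries more columns than this (see the printed ARD later in this guide).

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two arms, two subjects each
toy <- tibble(
  ARM = c("A", "A", "B", "B"),
  AGE = c(60, 70, 65, 75)
)

# One row per arm and statistic, which is the core idea behind an ARD
toy_ard <- toy |>
  group_by(ARM) |>
  summarise(N = n(), mean = mean(AGE), min = min(AGE), max = max(AGE)) |>
  pivot_longer(cols = -ARM, names_to = "stat_name", values_to = "stat")

toy_ard
```

Storing results in this shape keeps the computed statistics separate from how they are displayed, so the same values can feed a table, a listing, or a quality check.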
ARD ➡ Table
In the example below, we first build an ARD with the needed summary statistics using {cards}. Then, we use the ARD to build the demographics table with {gtsummary}.
library(cards)
library(gtsummary)
theme_gtsummary_compact() # reduce default padding and font size for a gt table
# build the ARD with the needed summary statistics using {cards}
ard <-
  ard_stack(
    adsl,
    ard_continuous(variables = AGE),
    ard_categorical(variables = c(AGEGR1, SEX, RACE)),
    .by = ACTARM, # split results by treatment arm
    .attributes = TRUE # optionally include column labels in the ARD
  )
# use the ARD to create a demographics table using {gtsummary}
tbl_ard_summary(
  cards = ard,
  by = ACTARM,
  include = c(AGE, AGEGR1, SEX, RACE),
  type = AGE ~ "continuous2",
  statistic = AGE ~ c("{N}", "{mean} ({sd})", "{median} ({p25}, {p75})", "{min}, {max}")
) |>
  bold_labels() |>
  modify_header(all_stat_cols() ~ "**{level}** \nN = {n}") |> # add Ns to header
  modify_footnote(everything() ~ NA) # remove default footnote
Characteristic | Placebo N = 86 | Xanomeline High Dose N = 72 | Xanomeline Low Dose N = 96 |
---|---|---|---|
Age (yr) | |||
N | 86 | 72 | 96 |
Mean (SD) | 75.2 (8.6) | 73.8 (7.9) | 76.0 (8.1) |
Median (Q1, Q3) | 76.0 (69.0, 82.0) | 75.5 (70.0, 79.0) | 78.0 (71.0, 82.0) |
Min, Max | 52.0, 89.0 | 56.0, 88.0 | 51.0, 88.0 |
Age group | |||
18-40 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
41-64 | 14 (16.3%) | 11 (15.3%) | 8 (8.3%) |
>=65 | 72 (83.7%) | 61 (84.7%) | 88 (91.7%) |
Sex | |||
FEMALE | 53 (61.6%) | 35 (48.6%) | 55 (57.3%) |
MALE | 33 (38.4%) | 37 (51.4%) | 41 (42.7%) |
Race | |||
AMERICAN INDIAN OR ALASKA NATIVE | 0 (0.0%) | 1 (1.4%) | 0 (0.0%) |
BLACK OR AFRICAN AMERICAN | 8 (9.3%) | 9 (12.5%) | 6 (6.3%) |
WHITE | 78 (90.7%) | 62 (86.1%) | 90 (93.8%) |
Table ➡ ARD
One may also build the demographics table in the classic way, using gtsummary::tbl_summary() on a data frame, and then extract the ARD from the table object.
# build demographics table directly from a data frame
tbl <- adsl |> tbl_summary(by = ACTARM, include = c(AGE, AGEGR1, SEX, RACE))
# extract ARD from table object
gather_ard(tbl)[[1]] |> select(-gts_column) # removing column so ARD fits on page
{cards} data frame: 162 x 11
group1 group1_level variable variable_level stat_name stat_label stat
1 ACTARM Placebo SEX FEMALE n n 53
2 ACTARM Placebo SEX FEMALE N N 86
3 ACTARM Placebo SEX FEMALE p % 0.616
4 ACTARM Xanomeli… SEX FEMALE n n 35
5 ACTARM Xanomeli… SEX FEMALE N N 72
6 ACTARM Xanomeli… SEX FEMALE p % 0.486
7 ACTARM Xanomeli… SEX FEMALE n n 55
8 ACTARM Xanomeli… SEX FEMALE N N 96
9 ACTARM Xanomeli… SEX FEMALE p % 0.573
10 ACTARM Placebo SEX MALE n n 33
ℹ 152 more rows
ℹ Use `print(n = ...)` to see more rows
ℹ 4 more variables: context, fmt_fn, warning, error
{rtables} & {tern}
The packages used with a brief description of their purpose are as follows:
- {rtables}: designed to create and display complex tables with R.
- {tern}: contains analysis functions to create tables and graphs used for clinical trial reporting.
After installing the packages, the first step is to load our pharmaverse packages and input data. Here, we also encode missing entries in the data frame adsl.
Note that {tern} depends on {rtables}, so the latter is automatically attached.
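The loading and encoding step described above is not shown as code in this text. A minimal sketch, assuming the encoding is done with `tern::df_explicit_na()`, is given below on a hypothetical toy data frame. The table code further down passes an object named `adsl2` to `build_table()`, which is presumably the result of applying this step to `adsl`; that link is an assumption.

```r
library(tern) # also attaches {rtables}

# Hypothetical toy data with missing entries in a factor and a character column
toy <- data.frame(
  SEX = factor(c("F", NA, "M")),
  RACE = c("WHITE", NA, "ASIAN")
)

# Replace NA with an explicit factor level ("<Missing>" by default);
# character columns are converted to factors along the way
toy2 <- df_explicit_na(toy)

levels(toy2$SEX)
levels(toy2$RACE)

# Assumed equivalent for the guide's data (not run here):
# adsl2 <- adsl |> df_explicit_na()
```

Making missing values an explicit level means they are counted as their own category rather than being silently dropped from the summaries.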
Now we create the demographic table.
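Before the full table, it may help to see the {rtables} pattern in miniature: a layout is declared first, independent of any data, and is then applied to a data frame with `build_table()`. The sketch below uses a hypothetical toy data frame and plain `analyze()` with `mean` as the analysis function, rather than the {tern} helper used for the demographic table.

```r
library(rtables)

# Hypothetical toy data: two arms, two subjects each
toy <- data.frame(
  ARM = factor(c("A", "A", "B", "B")),
  AGE = c(60, 70, 65, 75)
)

# Declare the layout: one column per arm, one row with the mean of AGE
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  analyze("AGE", afun = mean, format = "xx.x")

# Apply the layout to the data to produce the table
tbl <- build_table(lyt, toy)
tbl
```

Since the layout does not depend on the data, the same `lyt` object can be reused with another data frame that has the same columns.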
vars <- c("AGE", "AGEGR1", "SEX", "RACE")
var_labels <- c(
  "Age (yr)",
  "Age group",
  "Sex",
  "Race"
)
lyt <- basic_table(show_colcounts = TRUE) |>
  split_cols_by(var = "ACTARM") |>
  add_overall_col("All Patients") |>
  analyze_vars(
    vars = vars,
    var_labels = var_labels
  )
result <- build_table(lyt, adsl2)
result
Placebo Xanomeline High Dose Xanomeline Low Dose All Patients
(N=86) (N=72) (N=96) (N=254)
————————————————————————————————————————————————————————————————————————————————————————————————————————————
Age (yr)
n 86 72 96 254
Mean (SD) 75.2 (8.6) 73.8 (7.9) 76.0 (8.1) 75.1 (8.2)
Median 76.0 75.5 78.0 77.0
Min - Max 52.0 - 89.0 56.0 - 88.0 51.0 - 88.0 51.0 - 89.0
Age group
n 86 72 96 254
18-40 0 0 0 0
41-64 14 (16.3%) 11 (15.3%) 8 (8.3%) 33 (13%)
>=65 72 (83.7%) 61 (84.7%) 88 (91.7%) 221 (87%)
Sex
n 86 72 96 254
FEMALE 53 (61.6%) 35 (48.6%) 55 (57.3%) 143 (56.3%)
MALE 33 (38.4%) 37 (51.4%) 41 (42.7%) 111 (43.7%)
Race
n 86 72 96 254
AMERICAN INDIAN OR ALASKA NATIVE 0 1 (1.4%) 0 1 (0.4%)
BLACK OR AFRICAN AMERICAN 8 (9.3%) 9 (12.5%) 6 (6.2%) 23 (9.1%)
WHITE 78 (90.7%) 62 (86.1%) 90 (93.8%) 230 (90.6%)