As part of my placement year as a Data Sciences Industrial Placement student in Biostatistics at Roche in the UK, I was required to produce a “Business Project” and present it to the entire Data Sciences department. I decided to use pharmaverseadam to design a brand-new R training, “trainStats”, for junior Biostatisticians, since the package includes realistic ADaM datasets that are ideal for statistical analyses. For maximum efficiency, I tied my business project to a quantitative project report, due August 2024, for my undergraduate degree in Mathematics, Operational Research and Statistics at Cardiff University.
The quantitative project report investigates statistical analyses of preliminary clinical trial data using RStudio, as instructed by the trainStats program I authored to help ease new Biostatisticians into the industry. The program was built around the needs of people who are new to the industry and keen to pursue a career in Biostatistics.
I had a smooth experience with the
Throughout the trainStats documentation, I have primarily used the adoe_ophtha ADaM dataset (containing ophthalmology safety data) from pharmaverseadam. Since adoe_ophtha contains visit day, active arm and endpoint data, it was ideal for training purposes. In addition, I used the adsl dataset to encourage trainStats users to join and merge datasets, taking into account patient demographics such as age. Here is a snippet of the code I wrote to generate bar charts by active arm for the “Central Subfield Thickness” endpoint:
library(ggplot2)

# For Central Subfield Thickness
# (adoe_CST is adoe_ophtha filtered to PARAMCD == "SCSUBTH")
adoe_CST$ARM <- factor(adoe_CST$ARM, levels = c("Placebo", "Xanomeline Low Dose", "Xanomeline High Dose"))

ggplot(data = subset(adoe_CST, ARM != "Screen Failure"), aes(x = ARM, y = AVAL)) +
  geom_bar(stat = "identity") +
  xlab("Active Arm") +
  ylab("Central Subfield Thickness / um") +
  ylim(0, max(adoe_CST$AVAL))
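The join with adsl mentioned above is not shown in the post, so here is a minimal sketch of what such a merge might look like. It uses two small toy data frames (adoe_toy and adsl_toy, both hypothetical, with made-up values) that only borrow the ADaM column names USUBJID, PARAMCD, AVISITN, AVAL and AGE. The 65-year cut-off is an arbitrary choice for illustration.

```r
# Toy stand-ins for adoe_ophtha and adsl (hypothetical values; only the
# column names follow the ADaM conventions used in this post).
adoe_toy <- data.frame(
  USUBJID = c("01-001", "01-001", "01-002", "01-003"),
  PARAMCD = c("SCSUBTH", "SCSUBTH", "SCSUBTH", "SDRSSR"),
  AVISITN = c(1, 2, 1, 1),
  AVAL    = c(310, 295, 342, 47)
)
adsl_toy <- data.frame(
  USUBJID = c("01-001", "01-002", "01-003"),
  AGE     = c(63, 71, 58),
  SEX     = c("F", "M", "F")
)

# Keep only the Central Subfield Thickness records.
cst_toy <- subset(adoe_toy, PARAMCD == "SCSUBTH")

# Left join on the subject identifier: keep every endpoint record and
# attach the subject's age. Only the needed adsl columns are selected so
# that variables shared by both datasets are not duplicated.
cst_age <- merge(
  cst_toy,
  adsl_toy[, c("USUBJID", "AGE")],
  by = "USUBJID",
  all.x = TRUE
)

# Mean analysis value by age group (arbitrary 65-year cut-off).
cst_age$AGEGR <- ifelse(cst_age$AGE >= 65, ">=65", "<65")
mean_by_age <- aggregate(AVAL ~ AGEGR, data = cst_age, FUN = mean)
print(mean_by_age)
```

With dplyr loaded, the same step could be written as a left_join() by USUBJID. Base R is used here only so the sketch runs without extra packages.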
Below is another example of the code I wrote, this time to produce boxplots of the Central Subfield Thickness analysis value by visit:
library(dplyr)
library(pharmaverseadam)

# Filter adoe_ophtha to one dataset per endpoint
adoe_CST <- adoe_ophtha %>%
  filter(PARAMCD == "SCSUBTH")

adoe_DR <- adoe_ophtha %>%
  filter(PARAMCD == "SDRSSR")

# Boxplots for each visit day
boxplot(AVAL ~ AVISITN,
  data = adoe_CST,
  main = "Different boxplots for each visit day",
  xlab = "Visit Number",
  ylab = "Central Subfield Thickness / um",
  col = "orange",
  border = "brown"
)
As you can see above, both of these code snippets show the importance of clear logic and reasoning whilst coding, supported by good practices such as commenting and by clear data visualization. The code is simple and I personally found that using the
Further developing and improving the adoe_ophtha dataset would be invaluable for future applications and statistical analyses. ADOE datasets often have several endpoints, but the adoe_ophtha dataset included only two clinical parameters, namely “Central Subfield Thickness” and “Diabetic Retinopathy Severity Scale”. In addition, since the data is synthetic and randomly generated, the outputs showed no significant correlations or trends from a statistical perspective in terms of disease progression or measures of central tendency. Although the emphasis in this case was on understanding logic and reasoning whilst programming the statistical outputs, I experienced difficulties analysing the data quantitatively in my university report due to the high variation in the data. Going forward, if there is a method to simulate the data less randomly, then that may be more useful for future dummy analyses on
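One possible reading of “simulating the data less randomly” is to generate dummy values as a known signal plus modest noise, so that a trend is present by construction. The sketch below is only an illustration of that idea and not how pharmaverseadam generates its data. The baseline of 300 um, the per-visit slopes, the noise level and the object names (sim, slope, sim_means) are all assumptions made up for this example.

```r
set.seed(2024)  # fixed seed so the simulated data are reproducible

arms      <- c("Placebo", "Xanomeline Low Dose", "Xanomeline High Dose")
visits    <- 1:4
n_per_arm <- 20

# Hypothetical per-visit change in thickness (um) for each arm.
slope <- c(
  "Placebo"              = 0,
  "Xanomeline Low Dose"  = -5,
  "Xanomeline High Dose" = -10
)

# One row per subject, arm and visit.
sim <- expand.grid(
  SUBJ    = seq_len(n_per_arm),
  ARM     = arms,
  AVISITN = visits,
  stringsAsFactors = FALSE
)

# Signal (baseline + arm-specific linear trend) plus modest random noise.
sim$AVAL <- 300 +
  unname(slope[sim$ARM]) * (sim$AVISITN - 1) +
  rnorm(nrow(sim), mean = 0, sd = 8)

# Mean by arm and visit: the built-in trend should stand out from the noise.
sim_means <- aggregate(AVAL ~ ARM + AVISITN, data = sim, FUN = mean)
print(sim_means)
```

Lowering sd or steepening the slopes makes the trend easier to detect, while raising sd moves the data back towards the high-variation situation described above.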
Overall, my experience of using the
Last updated: 2025-01-17 08:40:58.667136
Citation
@online{parashar2024,
author = {Parashar, Syon},
title = {Undergraduate {University} {Statistics} {Report} Using
Pharmaverseadam Data},
date = {2024-09-16},
url = {https://pharmaverse.github.io/blog/posts/2024-09-16_university_undergraduate_report/university_undergraduate_report.html},
langid = {en}
}