De-Mystifying R Programming in Clinical Trials

A blog highlighting the benefits/limitations of using R Programming and using the right tools to create value
Community
Author

Venkatesan Balu

Published

May 2, 2024

Introduction

R has not always been a popular or obvious choice for clinical trials. Although its use has grown significantly in recent years, adoption in clinical trials is still not as widespread as anticipated. Practical implementation faces obstacles, including occasional misunderstandings, particularly around validation, and a notable lack of awareness of R's capabilities. Despite these challenges, R is steadily establishing a growing presence within the pharmaceutical industry.

Opportunities for R Programming in Clinical Trials

Although R is versatile and applicable in various settings, it is commonly associated with scientific computing and statistics. In the context of clinical trials, where researchers aim to understand and enhance drug development and testing processes, R has become a prominent tool for analyzing the collected data. While SAS® has been a longstanding programming language for clinical trials, the industry has been exploring alternative options. There is a quest for sustainable technology and tools that can effectively address industry challenges.

To drive innovation, there is a need to move away from traditional, inefficient processes and tools toward solutions that are efficient, simple, easy to implement, reliable, and cost-effective. Collaboration among industry stakeholders is crucial to develop a robust technology ecosystem and establish consensus on validation and regulatory benchmarks. Equally vital is preparing the workforce with the necessary skillsets to meet future demands.

SAS® or R Programming: Which is Better?

The ongoing debate in the programming community revolves around whether to replace SAS® with R, use both, or explore other alternatives such as Python. Rather than treating this as an either-or choice, it is recommended to leverage the strengths of each language for specific data science problems, recognizing that one size does not fit all. Early adopters of R have faced challenges, with regulatory compliance for R packages being a common issue. Despite these challenges, there have been notable successes that highlight the benefits and potential of using R in regulated industries.

For R to be considered for tasks related to regulatory submission, three things become imperative: a rigorous risk assessment of R packages, a feasibility analysis, and the establishment of processes for R usage through pilot projects with the necessary documentation. We see great progress in this area through efforts such as the R Consortium R Submissions Working Group.

Benefits of Using R Programming

R, as a language and environment for statistical computing and graphics, possesses characteristics that make it a potentially powerful tool for data analysis. With approximately 2 million users worldwide and three decades of history, R is open-source software with substantial community support. Its availability under the GNU General Public License and its extensive documentation add to its strength. R runs on a variety of platforms, offers a wide array of statistical and graphical techniques, and makes it easy to produce publication-quality plots.

The pharmaceutical industry has witnessed the emergence of various R packages tailored for Clinical Trial reporting. Examples include {rtables} for creating tables for reporting clinical trials, {admiral} for CDISC ADaM, {pkglite} to support eSubmission, and many others. Pharmaverse packages cater to different aspects of clinical trial data analysis, showcasing the versatility of R in this domain.

This article discusses the use of R in clinical trials and how the industry can take advantage of R's open-source nature. The FDA emphasizes the need to fully document the software packages used for statistical analysis in submissions. Because R is free and open source, its use poses specific validation challenges. To address them, the R Validation Hub has released guidance documents on this topic.

Because R packages are free of charge, R can also serve as a tool for API integration. In signal detection, for instance, R packages can be valuable because of the intricate derivation of the Empirical Bayes Geometric Mean (EBGM) in the Bayesian approach, which aims to mitigate false positive signals resulting from multiple comparisons. The computation adjusts the observed-to-expected reporting ratio for temporal trends and confounding variables such as age and sex. While R is not the only software that can estimate this, its accessibility as free software enables easy integration into any system as an API or for macro estimation purposes without copyright issues. As always, though, consult the license of any package you use to be sure your usage complies.
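
To make the observed-to-expected reporting ratio concrete, here is a minimal sketch in Python, one of the complementary tools mentioned above. It assumes the common approach of computing the expected count within each stratum (for example, an age or sex group) as if drug and event were independent, and then summing across strata. The function names and all counts are hypothetical, and the Bayesian shrinkage step that turns this ratio into an EBGM is deliberately left out.

```python
# Illustrative sketch only: the counts are made up and the function names
# are hypothetical (they are not taken from any package).

def expected_count(strata):
    """Expected reports for one drug-event pair, assuming drug and event
    are independent within each stratum (e.g. an age or sex group)."""
    return sum(
        s["drug_total"] * s["event_total"] / s["all_reports"] for s in strata
    )


def reporting_ratio(observed, strata):
    """Observed-to-expected reporting ratio, before any Bayesian shrinkage."""
    return observed / expected_count(strata)


# Hypothetical report counts, stratified by sex.
strata = [
    {"drug_total": 100, "event_total": 200, "all_reports": 4000},  # female
    {"drug_total": 100, "event_total": 300, "all_reports": 6000},  # male
]
observed = 20  # reports that mention both the drug and the event

ratio = reporting_ratio(observed, strata)
print(ratio)  # 2.0
```

A ratio of 2 here means the drug-event pair was reported twice as often as expected under independence. An empirical Bayes method would then temper ratios like this when they rest on small counts, which is the part that dedicated packages take care of.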

Identifying the Limitations in Using R Programming

Software cost matters to any company, including pharma and biotech companies. While R and RStudio® are free and SAS® requires an annual license, using R instead of SAS® may not always lower costs, because the price of the software is only one part of the equation. In a highly regulated industry such as pharmaceuticals, software validation, maintenance, and support are critical, and their costs need to be considered. In addition, although R is free and open source, it comes with a learning curve, and in the short term the industry might face a shortage of experienced pharma R programmers compared to those familiar with SAS®.

Leveraging the Right Tools to Capture Value

Capturing the value of R programming starts with a clear vision for its use and a systematic approach to identifying and prioritizing the needs in the industry. Clinical Data Science is evolving rapidly, and the industry actively seeks alternative solutions to unlock valuable insights from diverse datasets. Recognizing the need for innovation, collaboration, and efficient tools is crucial. Rather than viewing SAS®, R, and Python as mutually exclusive, leveraging the strengths of each for appropriate Data Science problems provides a nuanced and effective approach.

Ensuring data quality, scientific integrity, and regulatory compliance through risk assessment frameworks, validation, and documentation is imperative in this dynamic landscape. Pharmaverse is also actively steering the pharmaceutical industry’s path by pioneering connections and advocating for R, thus exemplifying the broader trend of industries acknowledging the value and potential of open-source tools in tackling complex challenges.

Last updated

2025-01-17 08:42:14.053755

Citation

BibTeX citation:
@online{balu2024,
  author = {Balu, Venkatesan},
  title = {De-Mystifying {R} {Programming} in {Clinical} {Trials}},
  date = {2024-05-02},
  url = {https://pharmaverse.github.io/blog/posts/2024-04-15_de-_mystifying__.../de-_mystifying__r__programming_in__clinical__trials.html},
  langid = {en}
}
For attribution, please cite this work as:
Balu, Venkatesan. 2024. “De-Mystifying R Programming in Clinical Trials.” May 2, 2024. https://pharmaverse.github.io/blog/posts/2024-04-15_de-_mystifying__.../de-_mystifying__r__programming_in__clinical__trials.html.