Unix versus SAS Time

This blog post explores differences in date handling between SAS and R, focusing on epoch discrepancies and data types, and provides key tips for accurate date conversions that prevent a 10-year date shift.
Technical
Author

Céline Piraux

Published

July 16, 2024

Recent discussions within the R Consortium Submission Working Group have highlighted challenges in handling dates between SAS and Unix systems. In addition, during the CDISC Pilot Data update at the PHUSE Test Data Factory, using R’s Unix time format caused dates (TRTSDT, TRTEDT, and ADT in adlbc.xpt) to be mistakenly shifted forward by 10 years.

This blog post explores the differences in date handling between SAS and R, focusing on epoch discrepancies and data types. It also discusses key considerations for conversion tools to ensure accurate date conversions and maintain data integrity.

Epoch in R and SAS

An epoch is a reference point from which time is measured. In computing, it’s often used as the starting point for a system’s time-keeping, allowing for the easy calculation of time intervals. Different systems use different epochs, which can lead to confusion if not properly managed.

In SAS, the epoch is defined as midnight on January 1, 1960 (01JAN1960:00:00:00).

R uses the Unix epoch, which begins at midnight on January 1, 1970 (01JAN1970:00:00:00).

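For example, a date with a value of 19725 corresponds to 02JAN2014 in SAS and 03JAN2024 in R. The two epochs are 3,653 days apart (ten years, three of which contain a leap day), so the same number lands roughly 10 years later when it is read against the Unix epoch.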
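The arithmetic behind this example can be checked with a few lines of code. The sketch below uses Python's standard library purely as a neutral calculator; it is not how SAS or R store dates internally, and the helper name `days_to_date` is made up for illustration.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)   # day 0 for SAS dates
UNIX_EPOCH = date(1970, 1, 1)  # day 0 for Unix-based dates, as used by R


def days_to_date(day_count, epoch):
    """Interpret a day count relative to the given epoch (hypothetical helper)."""
    return epoch + timedelta(days=day_count)


# Number of days separating the two epochs
EPOCH_OFFSET_DAYS = (UNIX_EPOCH - SAS_EPOCH).days

raw = 19725
as_sas = days_to_date(raw, SAS_EPOCH)
as_unix = days_to_date(raw, UNIX_EPOCH)

print(EPOCH_OFFSET_DAYS)  # 3653
print(as_sas)             # 2014-01-02
print(as_unix)            # 2024-01-03
```

Subtracting `EPOCH_OFFSET_DAYS` from a SAS day count gives the equivalent Unix day count; adding it converts in the other direction.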

Date Differences Between R and SAS

Another important aspect to consider is the difference in data types for numeric dates in R and SAS.

  • SAS: SAS only supports numeric and character data types. Therefore, numeric dates in SAS need a format applied to be human-readable.

  • R: In contrast, R has specific data types for dates, which allow numeric dates to be rendered with a date format directly in data frames. The data types in R are: “Date” for dates, “hms” (from the hms package) or “difftime” for time, and “POSIXct” or “POSIXlt” for datetime.

These differences can lead to discrepancies if not properly accounted for during data conversion, potentially resulting in a date shift of 10 years when transferring data between SAS and R.
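To see why the type information matters, consider a bare number with no format attached. The sketch below is a hypothetical illustration in Python, not code from SAS, R, or any conversion package. It relies on one fact that the text above does not state: SAS counts dates in days and datetimes in seconds since its 1960 epoch. The function name and the `kind` labels are invented for the example.

```python
from datetime import datetime, timedelta

SAS_EPOCH = datetime(1960, 1, 1)  # SAS reference point (no time zone attached)


def sas_numeric_to_python(value, kind):
    """Interpret a SAS numeric according to its format category.

    `kind` is a hypothetical label standing in for the SAS format:
    'date' (days), 'datetime' (seconds) or 'time' (seconds since midnight).
    """
    if kind == "date":
        return SAS_EPOCH.date() + timedelta(days=value)
    if kind == "datetime":
        return SAS_EPOCH + timedelta(seconds=value)
    if kind == "time":
        return timedelta(seconds=value)
    raise ValueError(f"unknown kind: {kind!r}")


# The same stored number means very different things depending on its format
print(sas_numeric_to_python(19725, "date"))      # 2014-01-02
print(sas_numeric_to_python(19725, "datetime"))  # 1960-01-01 05:28:45
print(sas_numeric_to_python(19725, "time"))      # 5:28:45
```

Without the format, a conversion tool cannot tell which of these three readings is intended, which is why a SAS numeric date needs its format carried along with it.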

Considerations for Conversion

To illustrate how date conversions between SAS and R can be managed, let’s look at the haven package in R. This package facilitates the conversion of data between SAS and R.

Dealing with Date Interoperability in Dataset-JSON

Dataset-JSON v1.1 could potentially be used in future data exchanges between SAS and R. Because programming languages handle dates differently, the challenge lies in managing dates within a language-agnostic data exchange format.

What is Dataset-JSON? Dataset-JSON is a dataset exchange standard that uses JSON. It is designed to meet regulatory submission requirements and other dataset exchange scenarios. It has the potential to replace XPT as the default format for clinical and device data submissions to regulatory authorities.

To address interoperability issues across programming languages, options include storing numeric dates as ISO 8601 strings with a targetDataType attribute that distinguishes character dates from numeric ones, or associating the epoch with the numeric date values. Either approach gives conversion tools enough information to handle dates correctly. Refer to the Dataset-JSON specification for the latest date handling details.
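The ISO 8601 option can be sketched as a round trip: the writer turns its epoch-based number into a calendar string, and each reader turns that string back into a number against its own epoch. The Python below is a minimal illustration under stated assumptions. The record layout and its keys (`name`, `value`) are hypothetical and are not the Dataset-JSON schema, and the helper functions are invented for the example.

```python
import json
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)
UNIX_EPOCH = date(1970, 1, 1)


def sas_days_to_iso(sas_days):
    """Render a SAS day count as an ISO 8601 calendar date string."""
    return (SAS_EPOCH + timedelta(days=sas_days)).isoformat()


def iso_to_days(iso_string, epoch):
    """Convert an ISO 8601 date string to a day count from the given epoch."""
    return (date.fromisoformat(iso_string) - epoch).days


# Writer side: the exchanged value is epoch-free text.
# The keys below are illustrative only, not taken from the specification.
payload = json.dumps({"name": "TRTSDT", "value": sas_days_to_iso(19725)})

# Reader side: each system rebuilds the number against its own epoch.
received = json.loads(payload)
unix_days = iso_to_days(received["value"], UNIX_EPOCH)
sas_days = iso_to_days(received["value"], SAS_EPOCH)

print(received["value"])  # 2014-01-02
print(unix_days)          # 16072
print(sas_days)           # 19725
```

Because the string carries the calendar date itself, neither side needs to know which epoch the other one uses.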

To ensure the correct handling of dates in Dataset-JSON, you can use the datasetjson R package, which is built to read and write datasets in the CDISC Dataset-JSON format.

Conclusion

The 10-year date shift between R and SAS can be avoided by using appropriate conversion tools such as xportr and haven. To ensure correct date conversions, verify that date and datetime variables have the correct data type in R and a date or datetime format applied in SAS. By paying attention to these details, you can maintain data integrity and avoid significant discrepancies during data conversion.

Last updated

2024-08-20 11:59:49.261654

Citation

BibTeX citation:
@online{piraux2024,
  author = {Piraux, Céline},
  title = {Unix Versus {SAS} {Time}},
  date = {2024-07-16},
  url = {https://pharmaverse.github.io/blog/posts/2024-07-08_unix_vs_sas_date.../unix_vs_sas_datetime.html},
  langid = {en}
}
For attribution, please cite this work as:
Piraux, Céline. 2024. “Unix Versus SAS Time.” July 16, 2024. https://pharmaverse.github.io/blog/posts/2024-07-08_unix_vs_sas_date.../unix_vs_sas_datetime.html.