Imputes the partial date portion of a '--DTC' variable based on user input.
Usage
impute_dtc_dt(
  dtc,
  highest_imputation = "n",
  date_imputation = "first",
  min_dates = NULL,
  max_dates = NULL,
  preserve = FALSE
)
Arguments
- dtc
The '--DTC' date to impute
A character date is expected in a format like yyyy-mm-dd or yyyy-mm-ddThh:mm:ss. Trailing components can be omitted and - is a valid "missing" value for any component.
- highest_imputation
Highest imputation level
The highest_imputation argument controls which components of the DTC value are imputed if they are missing. All components up to the specified level are imputed.
If a component at a higher level than the highest imputation level is missing, NA_character_ is returned. For example, for highest_imputation = "D", "2020" results in NA_character_ because the month is missing.
If "n" is specified, no imputation is performed, i.e., if any component is missing, NA_character_ is returned.
If "Y" is specified, date_imputation should be "first" or "last" and min_dates or max_dates should be specified respectively. Otherwise, NA_character_ is returned if the year component is missing.
Permitted Values: "Y" (year, highest level), "M" (month), "D" (day), "n" (none, lowest level)
- date_imputation
The value to impute the day/month when a date part is missing.
A character value is expected, either as a format with month and day specified as "mm-dd" (e.g., "06-15" for the 15th of June) or as a keyword: "first", "mid", or "last" to impute to the first/mid/last day/month. The year cannot be specified in the "mm-dd" format; to impute the year, use "first" or "last" together with the min_dates or max_dates argument (see examples).
If "mid" is specified, missing components are imputed as the middle of the possible range: if both month and day are missing, they are imputed as "06-30" (middle of the year); if only the day is missing, it is imputed as "15" (middle of the month).
The argument is ignored if highest_imputation is less than "D".
- min_dates
Minimum dates
A list of dates is expected. It is ensured that the imputed date is not before any of the specified dates, e.g., that the imputed adverse event start date is not before the first treatment date. Only dates which are in the range of possible dates of the dtc value are considered. The possible dates are defined by the missing parts of the dtc date (see example below). This ensures that the non-missing parts of the dtc date are not changed. A date or date-time object is expected. For example,
impute_dtc_dtm(
  "2020-11",
  min_dates = list(
    ymd_hms("2020-12-06T12:12:12"),
    ymd_hms("2020-11-11T11:11:11")
  ),
  highest_imputation = "M"
)
returns "2020-11-11T11:11:11" because the possible dates for "2020-11" range from "2020-11-01T00:00:00" to "2020-11-30T23:59:59". Therefore "2020-12-06T12:12:12" is ignored. Returning "2020-12-06T12:12:12" would have changed the month although it is not missing (in the dtc date).
- max_dates
Maximum dates
A list of dates is expected. It is ensured that the imputed date is not after any of the specified dates, e.g., that the imputed date is not after the data cut off date. Only dates which are in the range of possible dates are considered. A date or date-time object is expected.
- preserve
Preserve day if month is missing and day is present
For example, "2019---07" would return "2019-06-07" if preserve = TRUE (and date_imputation = "mid").
Permitted Values: TRUE, FALSE
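The effect of highest_imputation on values with different missing components can be illustrated with a short sketch. It assumes the admiral package is attached; the input vector is chosen for illustration only, and the commented results follow from the rules stated in the argument description above.

```r
library(admiral)

# Illustrative input: one value missing only the day, one missing month and day
partial <- c("2020-03", "2020")

# Imputation up to day level: only the day may be imputed, so "2020"
# (month missing) yields NA
day_level <- impute_dtc_dt(partial, highest_imputation = "D")
day_level
#> [1] "2020-03-01" NA

# Imputation up to month level: both month and day may be imputed
month_level <- impute_dtc_dt(partial, highest_imputation = "M")
month_level
#> [1] "2020-03-01" "2020-01-01"
```

Both calls rely on the default date_imputation = "first", which is why the imputed parts are "01".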
See also
Date/Time Computation Functions that return a vector:
compute_age_years(), compute_dtf(), compute_duration(), compute_tmf(), convert_date_to_dtm(), convert_dtc_to_dt(), convert_dtc_to_dtm(), impute_dtc_dtm()
Examples
library(lubridate)
dates <- c(
  "2019-07-18T15:25:40",
  "2019-07-18T15:25",
  "2019-07-18T15",
  "2019-07-18",
  "2019-02",
  "2019",
  "2019",
  "2019---07",
  ""
)
# No date imputation (highest_imputation defaulted to "n")
impute_dtc_dt(dtc = dates)
#> [1] "2019-07-18" "2019-07-18" "2019-07-18" "2019-07-18" NA
#> [6] NA NA NA NA
# Impute to first day/month if date is partial
impute_dtc_dt(
  dtc = dates,
  highest_imputation = "M"
)
#> [1] "2019-07-18" "2019-07-18" "2019-07-18" "2019-07-18" "2019-02-01"
#> [6] "2019-01-01" "2019-01-01" "2019-01-01" NA
# Same as above
impute_dtc_dt(
  dtc = dates,
  highest_imputation = "M",
  date_imputation = "01-01"
)
#> [1] "2019-07-18" "2019-07-18" "2019-07-18" "2019-07-18" "2019-02-01"
#> [6] "2019-01-01" "2019-01-01" "2019-01-01" NA
# Impute to last day/month if date is partial
impute_dtc_dt(
  dtc = dates,
  highest_imputation = "M",
  date_imputation = "last"
)
#> [1] "2019-07-18" "2019-07-18" "2019-07-18" "2019-07-18" "2019-02-28"
#> [6] "2019-12-31" "2019-12-31" "2019-12-31" NA
# Impute to mid day/month if date is partial
impute_dtc_dt(
  dtc = dates,
  highest_imputation = "M",
  date_imputation = "mid"
)
#> [1] "2019-07-18" "2019-07-18" "2019-07-18" "2019-07-18" "2019-02-15"
#> [6] "2019-06-30" "2019-06-30" "2019-06-30" NA
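The preserve argument is not exercised by the examples in this section, so the following sketch contrasts the two settings for the value "2019---07" (month missing, day present). It assumes the admiral package is attached; the expected results are taken from the argument description and from the "mid" example above.

```r
library(admiral)

# Default (preserve = FALSE): the present day is discarded and the
# mid-year date is imputed
not_preserved <- impute_dtc_dt(
  "2019---07",
  highest_imputation = "M",
  date_imputation = "mid"
)
not_preserved
#> [1] "2019-06-30"

# preserve = TRUE: the day "07" is kept and only the month is imputed
preserved <- impute_dtc_dt(
  "2019---07",
  highest_imputation = "M",
  date_imputation = "mid",
  preserve = TRUE
)
preserved
#> [1] "2019-06-07"
```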
# Impute a date and ensure that the imputed date is not before a list of
# minimum dates
impute_dtc_dt(
  "2020-12",
  min_dates = list(
    ymd("2020-12-06"),
    ymd("2020-11-11")
  ),
  highest_imputation = "M"
)
#> [1] "2020-12-06"
# Impute completely missing dates (only possible if min_dates or max_dates is specified)
impute_dtc_dt(
  c("2020-12", NA_character_),
  min_dates = list(
    ymd("2020-12-06", "2020-01-01"),
    ymd("2020-11-11", NA)
  ),
  highest_imputation = "Y"
)
#> [1] "2020-12-06" "2020-01-01"
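The examples above cover min_dates only. As a complementary sketch, the call below uses max_dates to cap a date imputed to the end of the month. It assumes admiral and lubridate are attached; the two cut-off dates are invented for illustration, and the commented result is what the max_dates description implies, namely that dates outside the possible range of the dtc value are not considered.

```r
library(admiral)
library(lubridate)

# Impute to the last day of the month, but not after the maximum dates
capped <- impute_dtc_dt(
  "2020-12",
  highest_imputation = "M",
  date_imputation = "last",
  max_dates = list(
    ymd("2020-12-15"), # within December 2020, so it caps the imputed date
    ymd("2020-11-20")  # outside the possible range of "2020-12", so ignored
  )
)
capped
#> [1] "2020-12-15"
```

Without max_dates, the same call would return "2020-12-31".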