[Deprecated]

This function is deprecated. Please use derive_vars_dt() and derive_vars_merged() instead.

Merge an imputed date variable and date imputation flag from a dataset to the input dataset. The observations to merge can be selected by a condition and/or by selecting the first or last observation for each by group.

Usage

derive_vars_merged_dt(
  dataset,
  dataset_add,
  by_vars,
  order = NULL,
  new_vars_prefix,
  filter_add = NULL,
  mode = NULL,
  dtc,
  date_imputation = NULL,
  flag_imputation = "auto",
  min_dates = NULL,
  max_dates = NULL,
  preserve = FALSE,
  check_type = "warning",
  duplicate_msg = NULL
)

Arguments

dataset

Input dataset

The variables specified by the by_vars argument are expected.

dataset_add

Additional dataset

The variables specified by the by_vars, dtc, and order arguments are expected.

by_vars

Grouping variables

The input dataset and the selected observations from the additional dataset are merged by the specified by variables. The by variables must be a unique key of the selected observations.

Permitted Values: list of variables created by vars()

order

Sort order

If the argument is set to a non-null value, for each by group the first or last observation from the additional dataset is selected with respect to the specified order. The imputed date variable can be specified as well (see examples below).

Please note that NA is considered the last value. That is, if an order variable is NA and mode = "last", this observation is chosen, while for mode = "first" the observation is chosen only if there are no observations where the variable is not NA.

Default: NULL

Permitted Values: list of variables or desc(<variable>) function calls created by vars(), e.g., vars(ADT, desc(AVAL)), or NULL
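The NA handling described above can be illustrated with a minimal base R sketch. The dataset ex and its variables are hypothetical, and the code only mimics the described selection; it is not the admiral implementation.

```r
# Hypothetical additional dataset: three records for one subject,
# one of them with a missing order variable (EXSEQ).
ex <- data.frame(
  USUBJID = c("01", "01", "01"),
  EXSEQ = c(2, NA, 1),
  EXSTDTC = c("2020-02", "2020-03", "2020-01"),
  stringsAsFactors = FALSE
)

# Sort within the by group; NA is placed last, as described for `order`.
sorted <- ex[order(ex$USUBJID, ex$EXSEQ, na.last = TRUE), ]

# mode = "first": the first record per by group after sorting.
first_obs <- sorted[!duplicated(sorted$USUBJID), ]

# mode = "last": the last record per by group, here the one with EXSEQ = NA.
last_obs <- sorted[!duplicated(sorted$USUBJID, fromLast = TRUE), ]
```

In this sketch first_obs holds the record with EXSEQ = 1, while last_obs holds the record whose EXSEQ is NA.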

new_vars_prefix

Prefix used for the output variable(s).

A character scalar is expected. For the date variable "DT" is appended to the specified prefix and for the date imputation flag "DTF". I.e., for new_vars_prefix = "AST" the variables ASTDT and ASTDTF are created.

filter_add

Filter for additional dataset (dataset_add)

Only observations fulfilling the specified condition are taken into account for merging. If the argument is not specified, all observations are considered.

Default: NULL

Permitted Values: a condition

mode

Selection mode

Determines if the first or last observation is selected. If the order argument is specified, mode must be non-null.

If the order argument is not specified, the mode argument is ignored.

Default: NULL

Permitted Values: "first", "last", NULL

dtc

The '--DTC' date to impute

A character date is expected in a format like yyyy-mm-dd or yyyy-mm-ddThh:mm:ss. Trailing components can be omitted and - is a valid "missing" value for any component.
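To make the accepted format concrete, the following base R sketch splits the date part of a dtc value into year, month, and day, treating omitted trailing components and "-" as missing. The helper split_dtc_date() and its regular expression are hypothetical illustrations and not part of admiral.

```r
# Hypothetical helper, not part of admiral: split the date part of a
# --DTC value into its components. Omitted components and "-" become NA.
split_dtc_date <- function(dtc) {
  # year, optional month, optional day, optional time part (ignored here)
  pattern <- "^([0-9]{4}|-)(-([0-9]{2}|-)(-([0-9]{2}))?)?(T.*)?$"
  m <- regmatches(dtc, regexec(pattern, dtc))[[1]]
  if (length(m) == 0) {
    stop("Unexpected date format: ", dtc)
  }
  parts <- c(year = m[2], month = m[4], day = m[6])
  parts[parts %in% c("", "-")] <- NA
  parts
}

split_dtc_date("2020-11")    # day is omitted
split_dtc_date("2019---07")  # month is given as "-"
```

The sketch covers only the date components; the time part after "T" is matched but not split further.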

date_imputation

The value to impute the day/month when a date part is missing.

A character value is expected, either as a

  • format with month and day specified as "mm-dd", e.g., "06-15" for the 15th of June. The year cannot be specified; to impute the year, "first" or "last" can be used together with the min_dates or max_dates argument (see examples),

  • or as a keyword: "first", "mid", "last" to impute to the first/mid/last day/month.

The argument is ignored if highest_imputation is less than "D".

Default: NULL

flag_imputation

Whether the date imputation flag must also be derived.

If "auto" is specified, the date imputation flag is derived if the date_imputation argument is not null.

Default: "auto"

Permitted Values: "auto", "date" or "none"

min_dates

Minimum dates

A list of dates is expected, each a date or date-time object. The imputed date is guaranteed not to be before any of the specified dates, e.g., the imputed adverse event start date is not before the first treatment date. Only dates which are in the range of possible dates of the dtc value are considered. The possible dates are defined by the missing parts of the dtc date (see example below). This ensures that the non-missing parts of the dtc date are not changed. For example,

library(admiral)
library(lubridate) # provides ymd_hms()

impute_dtc_dtm(
  "2020-11",
  min_dates = list(
    ymd_hms("2020-12-06T12:12:12"),
    ymd_hms("2020-11-11T11:11:11")
  ),
  highest_imputation = "M"
)

returns "2020-11-11T11:11:11" because the possible dates for "2020-11" range from "2020-11-01T00:00:00" to "2020-11-30T23:59:59". Therefore "2020-12-06T12:12:12" is ignored. Returning "2020-12-06T12:12:12" would have changed the month although it is not missing (in the dtc date).
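The range logic described above can be sketched in base R as follows. The helpers utc() and apply_min_dates() are hypothetical and only illustrate the rule; they are not the admiral implementation, and the range of possible dates is passed in explicitly instead of being derived from the dtc value.

```r
# Hypothetical helper: parse a date-time string as UTC.
utc <- function(x) as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# Hypothetical helper, not part of admiral: move an imputed value forward so
# that it is not before any minimum date lying in the range of possible dates.
apply_min_dates <- function(imputed, range_start, range_end, min_dates) {
  result <- imputed
  for (i in seq_along(min_dates)) {
    d <- min_dates[[i]]
    # Minimum dates outside the range of possible dates are ignored.
    if (d >= range_start && d <= range_end && d > result) {
      result <- d
    }
  }
  result
}

# "2020-11" can be any time from 2020-11-01T00:00:00 to 2020-11-30T23:59:59.
res <- apply_min_dates(
  imputed = utc("2020-11-01T00:00:00"),
  range_start = utc("2020-11-01T00:00:00"),
  range_end = utc("2020-11-30T23:59:59"),
  min_dates = list(utc("2020-12-06T12:12:12"), utc("2020-11-11T11:11:11"))
)
format(res, "%Y-%m-%dT%H:%M:%S", tz = "UTC")
```

As in the example above, the December date lies outside the possible range and is skipped, so the sketch yields "2020-11-11T11:11:11".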

max_dates

Maximum dates

A list of dates is expected, each a date or date-time object. The imputed date is guaranteed not to be after any of the specified dates, e.g., the imputed date is not after the data cutoff date. Only dates which are in the range of possible dates are considered.

preserve

Preserve day if month is missing and day is present

For example, "2019---07" would return "2019-06-07" if preserve = TRUE (and date_imputation = "mid").

Permitted Values: TRUE, FALSE

Default: FALSE

check_type

Check uniqueness?

If "warning" or "error" is specified, the specified message is issued if the observations of the (restricted) additional dataset are not unique with respect to the by variables and the order.

Default: "warning"

Permitted Values: "none", "warning", "error"

duplicate_msg

Message of unique check

If the uniqueness check fails, the specified message is displayed.

Default:

paste("Dataset `dataset_add` contains duplicate records with respect to",
      enumerate(vars2chr(by_vars)))

Value

The output dataset contains all observations and variables of the input dataset and additionally the variable <new_vars_prefix>DT and optionally the variable <new_vars_prefix>DTF derived from the additional dataset (dataset_add).

Details

  1. The additional dataset is restricted to the observations matching the filter_add condition.

  2. The date variable and, if requested, the date imputation flag are added to the additional dataset.

  3. If order is specified, for each by group the first or last observation (depending on mode) is selected.

  4. The date and flag variables are merged to the input dataset.
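The four steps can be mimicked in base R as a rough sketch. The datasets adsl and ex, their variables, and the filter condition are hypothetical. No imputation is performed here (corresponding to date_imputation = NULL), so only complete dates are converted and no flag variable is created. This is not the admiral implementation.

```r
# Hypothetical input dataset and additional dataset.
adsl <- data.frame(USUBJID = c("01", "02", "03"), stringsAsFactors = FALSE)
ex <- data.frame(
  USUBJID = c("01", "01", "02", "02"),
  EXSTDTC = c("2020-03-10", "2020-01-15", "2020-02-01", "2020-02-20"),
  EXDOSE = c(10, 10, 0, 20),
  stringsAsFactors = FALSE
)

# 1. Restrict the additional dataset (like filter_add = EXDOSE > 0).
ex_sel <- ex[ex$EXDOSE > 0, ]

# 2. Add the date variable (new_vars_prefix = "TRTS" gives TRTSDT).
ex_sel$TRTSDT <- as.Date(ex_sel$EXSTDTC, format = "%Y-%m-%d")

# 3. Select the first observation per by group (order = TRTSDT, mode = "first").
ex_sel <- ex_sel[order(ex_sel$USUBJID, ex_sel$TRTSDT), ]
ex_sel <- ex_sel[!duplicated(ex_sel$USUBJID), ]

# 4. Merge the new variable to the input dataset, keeping all its observations.
result <- merge(adsl, ex_sel[, c("USUBJID", "TRTSDT")], by = "USUBJID", all.x = TRUE)
```

Subject "03" has no record in ex, so TRTSDT is NA for that subject while the observation itself is kept.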

Author

Stefan Bundfuss