Extract Duplicate Records

Usage

extract_duplicate_records(dataset, by_vars)

Arguments

dataset

Input dataset

The variables specified by the by_vars argument are expected to be in the dataset.

by_vars

Grouping variables

Defines groups of records in which to look for duplicates.

Permitted Values: list of variables created by exprs(), e.g., exprs(USUBJID, VISIT)
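
To illustrate what "duplicates within groups" means, here is a minimal base R sketch on a made-up data frame. It is not the function's actual implementation. The toy data (visits), the helper names (by_cols, keys, is_dup, dups), and the use of duplicated() are assumptions chosen for illustration. The sketch treats a record as a duplicate when its combination of by-variable values occurs more than once, even if other columns differ.

```r
# Hypothetical toy data: subject "01" has two records for "WEEK 1"
visits <- data.frame(
  USUBJID = c("01", "01", "01", "02"),
  VISIT = c("WEEK 1", "WEEK 1", "WEEK 2", "WEEK 1"),
  AVAL = c(10, 12, 11, 9)
)

# Columns playing the role of by_vars (illustrative name)
by_cols <- c("USUBJID", "VISIT")
keys <- visits[by_cols]

# Flag every record whose key combination appears more than once;
# scanning from both ends marks the first occurrence as well as the repeats
is_dup <- duplicated(keys) | duplicated(keys, fromLast = TRUE)

dups <- visits[is_dup, ]
print(dups)
```

In this sketch the two "WEEK 1" records of subject "01" are flagged although their AVAL values differ, because only the by-variable columns are compared.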

Value

A data.frame containing the records of dataset that are duplicated with respect to by_vars

Examples

# Assumes the admiral package, which provides admiral_adsl and exprs()
library(admiral)
data(admiral_adsl)

# Duplicate the first record
adsl <- rbind(admiral_adsl[1L, ], admiral_adsl)

extract_duplicate_records(adsl, exprs(USUBJID))
#> # A tibble: 2 × 50
#>   USUBJID     STUDYID  SUBJID RFSTDTC RFENDTC RFXSTDTC RFXENDTC RFICDTC RFPENDTC
#>   <chr>       <chr>    <chr>  <chr>   <chr>   <chr>    <chr>    <chr>   <chr>   
#> 1 01-701-1015 CDISCPI… 1015   2014-0… 2014-0… 2014-01… 2014-07… NA      2014-07…
#> 2 01-701-1015 CDISCPI… 1015   2014-0… 2014-0… 2014-01… 2014-07… NA      2014-07…
#> # ℹ 41 more variables: DTHDTC <chr>, DTHFL <chr>, SITEID <chr>, AGE <dbl>,
#> #   AGEU <chr>, SEX <chr>, RACE <chr>, ETHNIC <chr>, ARMCD <chr>, ARM <chr>,
#> #   ACTARMCD <chr>, ACTARM <chr>, COUNTRY <chr>, DMDTC <chr>, DMDY <dbl>,
#> #   TRT01P <chr>, TRT01A <chr>, TRTSDTM <dttm>, TRTSTMF <chr>, TRTEDTM <dttm>,
#> #   TRTETMF <chr>, TRTSDT <date>, TRTEDT <date>, TRTDURD <dbl>, SCRFDT <date>,
#> #   EOSDT <date>, EOSSTT <chr>, FRVDT <date>, RANDDT <date>, DTHDT <date>,
#> #   DTHADY <dbl>, LDDTHELD <dbl>, LSTALVDT <date>, AGEGR1 <fct>, SAFFL <chr>, …