Runs all three synthesis methods (parametric, bootstrap, and noise) on the same data and returns a table of validation metrics for comparison.
Usage
compare_methods(data, n = nrow(data), seed = NULL)
Arguments
- data
A data frame of real data.
- n
Number of synthetic records to generate. Default: nrow(data), the same number of rows as the input.
- seed
Random seed passed to synthesize().
References
Jordon J, et al. (2022). Synthetic Data – what, why and how? arXiv preprint arXiv:2205.03257. doi:10.48550/arXiv.2205.03257
Examples
set.seed(42)
real <- data.frame(x = rnorm(100), y = rnorm(100))
compare_methods(real, seed = 42)
#>
#> ── Synthesis method comparison
#> # A tibble: 12 × 4
#>    method     metric             value  interpretation
#>  * <chr>      <chr>              <dbl>  <chr>
#>  1 parametric ks_statistic_mean  0.085  Good fidelity
#>  2 parametric correlation_diff   0.0263 Excellent
#>  3 parametric discriminative_auc 0.532  Indistinguishable
#>  4 parametric nn_distance_ratio  0.889  Moderate risk
#>  5 bootstrap  ks_statistic_mean  0.08   Good fidelity
#>  6 bootstrap  correlation_diff   0.122  Poor
#>  7 bootstrap  discriminative_auc 0.509  Indistinguishable
#>  8 bootstrap  nn_distance_ratio  0.562  Moderate risk
#>  9 noise      ks_statistic_mean  0.085  Good fidelity
#> 10 noise      correlation_diff   0.120  Poor
#> 11 noise      discriminative_auc 0.502  Indistinguishable
#> 12 noise      nn_distance_ratio  0.622  Moderate risk
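
To give a feel for what a metric such as ks_statistic_mean might measure, here is a minimal base-R sketch. It rests on two assumptions that this page does not confirm: that the metric is the average of per-column two-sample Kolmogorov-Smirnov statistics, and that a parametric synthesis can be approximated by drawing from a normal distribution fitted to each column. The helper mean_ks() is hypothetical and not part of the package, so its result need not match the table above.

```r
# Sketch only: mean_ks() is a hypothetical helper, not the package's code.
set.seed(42)
real <- data.frame(x = rnorm(100), y = rnorm(100))

# Stand-in for a parametric synthesis (an assumption): draw each column from
# a normal distribution fitted to the real column's mean and standard deviation.
synthetic <- as.data.frame(lapply(real, function(col) {
  rnorm(length(col), mean = mean(col), sd = sd(col))
}))

# Average the two-sample Kolmogorov-Smirnov statistic over all columns.
mean_ks <- function(real, synthetic) {
  stats <- vapply(names(real), function(nm) {
    unname(ks.test(real[[nm]], synthetic[[nm]])$statistic)
  }, numeric(1))
  mean(stats)
}

mean_ks(real, synthetic)
```

Under this reading the value lies between 0 and 1, and smaller values mean the synthetic marginal distributions sit closer to the real ones.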