
Detect outliers among the areas under the curve (see egg_aucs()) and the slopes (see egg_slopes()) computed over several intervals from a model fitted by egg_model(). For details, see the iqr and zscore methods of performance::check_outliers().

Usage

egg_outliers(
  fit,
  period = c(0, 0.5, 1.5, 3.5, 6.5, 10, 12, 17),
  knots = c(1, 8, 12),
  from = c("predicted", "observed"),
  start = 0.25,
  end = 10,
  step = 0.01,
  filter = NULL,
  outlier_method = "iqr",
  outlier_threshold = list(iqr = 2)
)

Arguments

fit

A model object from a statistical model such as from a call to egg_model().

period

The bounds of the intervals over which slopes are to be computed.

knots

The knots as defined in fit and according to method.

from

A string indicating the type of data to be used for the adiposity peak (AP) and adiposity rebound (AR) computation, either "predicted" or "observed". Default is "predicted".

start

The start of the time window to compute AP and AR.

end

The end of the time window to compute AP and AR.

step

The step to increment the sequence.

filter

A string following data.table syntax for filtering on "i" (i.e., row elements), e.g., filter = "source == 'A'". The argument is passed through to compute_apar() (see predict_bmi()). Default is NULL.

outlier_method

The outlier detection method(s). Default is "iqr". Can be "cook", "pareto", "zscore", "zscore_robust", "iqr", "ci", "eti", "hdi", "bci", "mahalanobis", "mahalanobis_robust", "mcd", "ics", "optics" or "lof". See performance::check_outliers() https://easystats.github.io/performance/reference/check_outliers.html for details.

outlier_threshold

A list containing the threshold values for each method (e.g., list('mahalanobis' = 7, 'cook' = 1)), above which an observation is considered an outlier. If NULL, default values will be used (see 'Details'). If a single numeric value is given, it will be used as the threshold for every method that is run. See performance::check_outliers() https://easystats.github.io/performance/reference/check_outliers.html for details.
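The interval and time-window arguments can be made concrete with a short base R sketch. It rests on two assumptions that this page does not confirm: that period is read as consecutive (lower, upper) pairs, which is inferred from the auc_0--0.5 label in the example output, and that start, end and step define a regular age grid built with seq(). The object names are illustrative only.

```r
# Assumption: `period` is read as consecutive (lower, upper) pairs of bounds.
period <- c(0, 0.5, 1.5, 3.5, 6.5, 10, 12, 17)
intervals <- matrix(
  period,
  ncol = 2,
  byrow = TRUE,
  dimnames = list(NULL, c("lower", "upper"))
)

# Build labels in the same "lower--upper" style as the example output.
interval_labels <- paste(intervals[, "lower"], intervals[, "upper"], sep = "--")
print(interval_labels)

# Assumption: the AP/AR time window is a regular grid from `start` to `end`.
start <- 0.25
end <- 10
step <- 0.01
age_grid <- seq(from = start, to = end, by = step)
print(range(age_grid))
```

A smaller step gives a finer grid at the cost of more predictions per individual.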

Value

A data.frame listing, for each parameter and individual, whether that individual is flagged as an outlier based on several criteria.
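To give an intuition for the "iqr" criterion, here is a minimal base R sketch of Tukey-style fences using the default threshold of 2 from the usage above. flag_iqr() is a hypothetical helper written for this illustration, not part of eggla or performance, and the input values are made up; performance::check_outliers() may compute its distances differently.

```r
# Hypothetical helper: flag values outside Q1 - k * IQR or Q3 + k * IQR.
flag_iqr <- function(x, threshold = 2) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  lower <- q[1] - threshold * iqr
  upper <- q[2] + threshold * iqr
  # 1 = flagged as outlier, 0 = not flagged
  as.integer(x < lower | x > upper)
}

# Made-up values for illustration only; the last one is far from the rest.
x <- c(16.1, 15.8, 16.4, 15.9, 16.2, 16.0, 22.5)
flags <- flag_iqr(x, threshold = 2)
print(flags)
```

A larger threshold widens the fences and flags fewer individuals.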

Examples

data("bmigrowth")
res <- egg_model(
  formula = log(bmi) ~ age,
  data = bmigrowth[bmigrowth[["sex"]] == 0, ],
  id_var = "ID",
  random_complexity = 1
)
#> Fitting model:
#>   nlme::lme(
#>     fixed = log(bmi) ~ gsp(age, knots = c(1, 8, 12), degree = rep(3, 4), smooth = rep(2, 3)),
#>     data = data,
#>     random = ~ gsp(age, knots = c(1, 8, 12), degree = rep(1, 4), smooth = rep(2, 3)) | ID,
#>     na.action = stats::na.omit,
#>     method = "ML",
#>     control = nlme::lmeControl(opt = "optim", niterEM = 25, maxIter = 500, msMaxIter = 500)
#>   )
head(egg_outliers(
  fit = res,
  period = c(0, 0.5, 1.5, 3.5, 6.5, 10, 12, 17),
  knots = c(1, 8, 12)
)[Outlier != 0])
#>      parameter     ID   Row Distance_IQR Outlier_IQR Outlier
#>         <char> <char> <int>        <num>       <num>   <num>
#> 1:  auc_0--0.5    048    25     1.577435           1       1
#> 2:  auc_0--0.5    049    26     1.363502           1       1
#> 3: AR_ageyears    081    39     1.280303           1       1
#> 4:      AP_bmi    001     1     1.295708           1       1
#> 5:      AP_bmi    033    16     1.375866           1       1
#> 6:      AP_bmi    044    21     1.791524           1       1
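As a possible follow-up, the flagged rows can be tallied per parameter to see which measures drive the outlier calls. The sketch below retypes the six rows printed above into a plain data.frame (only the parameter, ID and Outlier columns) so that it runs without fitting a model; the object names are illustrative.

```r
# The six flagged rows from the printed output above, retyped by hand.
flagged <- data.frame(
  parameter = c("auc_0--0.5", "auc_0--0.5", "AR_ageyears",
                "AP_bmi", "AP_bmi", "AP_bmi"),
  ID = c("048", "049", "081", "001", "033", "044"),
  Outlier = c(1, 1, 1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Number of flagged individuals per parameter.
n_flags <- table(flagged$parameter)
print(n_flags)

# Individuals flagged on at least one parameter.
ids_to_review <- sort(unique(flagged$ID))
print(ids_to_review)
```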