Takes the output of lavaan_rerun() or lavaan::lavaan() and computes the Mahalanobis distance for each case using only the observed predictors.
Usage
mahalanobis_predictors(
  fit,
  emNorm_arg = list(estimate.worst = FALSE, criterion = 1e-06)
)
Arguments
- fit
The output from lavaan, such as lavaan::cfa() and lavaan::sem(), or the output from lavaan_rerun().
- emNorm_arg
No longer used. Kept for backward compatibility.
Value
An md_semfindr-class object, which is a one-column matrix (a column vector) of the Mahalanobis distance for each case. The number of rows equals the number of cases in the data stored in the fit object.
A print method is available for user-friendly output.
Details
mahalanobis_predictors() computes the Mahalanobis distance of each case on the observed predictors.
If there are no missing values, stats::mahalanobis() is used to compute the Mahalanobis distance.
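The complete-data case can be sketched in base R. This is a minimal illustration on made-up data, not the package's internal code: the variable names, the simulated values, and the use of the sample means and sample covariance matrix are all assumptions. Note that stats::mahalanobis() returns the squared distance.

```r
# Hypothetical data: two observed predictors with no missing values
set.seed(1234)
x <- data.frame(iv1 = rnorm(20), iv2 = rnorm(20))

# Squared Mahalanobis distance of each case from the sample means,
# using the sample covariance matrix (an assumption for this sketch)
md <- stats::mahalanobis(x, center = colMeans(x), cov = stats::cov(x))

# Arrange as a one-column matrix, one row per case
md_mat <- matrix(md, ncol = 1, dimnames = list(rownames(x), "md"))
head(md_mat)
```

Cases with larger values lie farther from the centroid of the predictors, after accounting for the variances and covariances of those predictors.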
If there are missing values on the observed predictors, the means and variance-covariance matrices are estimated by maximum likelihood using lavaan::lavCor(). The estimates are then passed to modi::MDmiss() to compute the Mahalanobis distance.
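The general idea of a distance computed on incomplete cases can be sketched in base R. This is only an illustration under stated assumptions: the data and the helper md_observed_only() are hypothetical, pairwise-complete estimates stand in for the maximum likelihood estimates, and the rescaling by the proportion of observed variables is one possible correction, assumed here rather than taken from the package. It does not call lavaan::lavCor() or modi::MDmiss().

```r
# Hypothetical data: case 3 has a missing value on iv2
x <- cbind(iv1 = c(1.2, -0.4, 0.3, 2.1, -1.0, 0.8),
           iv2 = c(0.5, 1.1, NA, -0.7, 0.2, -1.3))

# Stand-in estimates (pairwise-complete), NOT maximum likelihood estimates
center <- colMeans(x, na.rm = TRUE)
covmat <- stats::cov(x, use = "pairwise.complete.obs")

# Hypothetical helper: squared distance on the observed variables only,
# rescaled by (number of variables) / (number of observed variables)
md_observed_only <- function(row, center, covmat) {
  obs <- !is.na(row)
  if (!any(obs)) return(NA_real_)
  d2 <- stats::mahalanobis(row[obs], center[obs],
                           covmat[obs, obs, drop = FALSE])
  d2 * length(row) / sum(obs)
}

md <- apply(x, 1, md_observed_only, center = center, covmat = covmat)
md
```

A case with a missing value still receives a distance, based on the predictors it does have.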
Both single-group and multiple-group models are supported. For multiple-group models, the Mahalanobis distance for each case is computed using the means and covariance matrix of the group to which the case belongs. (Support for multiple-group models is available in version 0.1.4.8 and later.)
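The group-wise computation can be sketched in base R as follows. The data, the grouping variable gp, and the use of sample means and sample covariance matrices are assumptions made for illustration; this is not the package's internal code.

```r
# Hypothetical two-group data with two observed predictors
set.seed(5678)
dat_mg <- data.frame(gp = rep(c("a", "b"), each = 15),
                     iv1 = rnorm(30),
                     iv2 = rnorm(30))

md_mg <- numeric(nrow(dat_mg))
for (g in unique(dat_mg$gp)) {
  idx <- dat_mg$gp == g
  xg <- dat_mg[idx, c("iv1", "iv2")]
  # Each case is compared with the means and covariance matrix
  # of its own group, not those of the pooled sample
  md_mg[idx] <- stats::mahalanobis(xg, colMeans(xg), stats::cov(xg))
}
head(md_mg)
```

Under this approach, a case that is unusual relative to the pooled sample but typical of its own group does not receive a large distance.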
References
Béguin, C., & Hulliger, B. (2004). Multivariate outlier detection in incomplete survey data: The epidemic algorithm and transformed rank correlations. Journal of the Royal Statistical Society: Series A (Statistics in Society), 167(2), 275-294.
Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proceedings of the National Institute of Science of India, 2, 49-55.
Schafer, J. L. (1997). Analysis of incomplete multivariate data. Chapman & Hall/CRC Press.
Author
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
Examples
# pa_dat and mahalanobis_predictors() come from the semfindr package
library(semfindr)
library(lavaan)
dat <- pa_dat
# For illustration, select only the first 50 cases.
dat <- dat[1:50, ]
# The model
mod <-
"
m1 ~ a1 * iv1 + a2 * iv2
dv ~ b * m1
a1b := a1 * b
a2b := a2 * b
"
# Fit the model
fit <- lavaan::sem(mod, dat)
summary(fit)
#> lavaan 0.6.17 ended normally after 1 iteration
#>
#> Estimator ML
#> Optimization method NLMINB
#> Number of model parameters 5
#>
#> Number of observations 50
#>
#> Model Test User Model:
#>
#> Test statistic 1.768
#> Degrees of freedom 2
#> P-value (Chi-square) 0.413
#>
#> Parameter Estimates:
#>
#> Standard errors Standard
#> Information Expected
#> Information saturated (h1) model Structured
#>
#> Regressions:
#> Estimate Std.Err z-value P(>|z|)
#> m1 ~
#> iv1 (a1) -0.159 0.166 -0.954 0.340
#> iv2 (a2) 0.525 0.162 3.241 0.001
#> dv ~
#> m1 (b) 0.350 0.161 2.169 0.030
#>
#> Variances:
#> Estimate Std.Err z-value P(>|z|)
#> .m1 0.901 0.180 5.000 0.000
#> .dv 1.423 0.285 5.000 0.000
#>
#> Defined Parameters:
#> Estimate Std.Err z-value P(>|z|)
#> a1b -0.056 0.064 -0.873 0.382
#> a2b 0.184 0.102 1.803 0.071
#>
md_predictors <- mahalanobis_predictors(fit)
md_predictors
#>
#> -- Mahalanobis Distance --
#>
#> md
#> 13 7.179
#> 45 6.707
#> 50 6.297
#> 33 5.479
#> 43 5.115
#> 25 4.909
#> 27 4.685
#> 20 4.378
#> 32 4.157
#> 34 3.432
#>
#> Note:
#> - Only the first 10 case(s) is/are displayed. Set ‘first’ to NULL to display all cases.
#> - Cases sorted by Mahalanobis distance in decreasing order.
#> - Mahalanobis distance computed only on predictors.