BIC Posterior Probabilities of Neighboring Models

Identify neighboring models, fit them, and return the BIC posterior probabilities.

Usage

model_set(
  sem_out,
  partables = NULL,
  model_set_out = NULL,
  prior_sem_out = NULL,
  must_add = NULL,
  must_not_add = NULL,
  must_drop = NULL,
  must_not_drop = NULL,
  remove_constraints = TRUE,
  exclude_error_cov = TRUE,
  exclude_feedback = TRUE,
  exclude_xy_cov = TRUE,
  df_change_add = 1,
  df_change_drop = 1,
  remove_duplicated = TRUE,
  fit_models = ifelse(!is.null(model_set_out$fit), FALSE, TRUE),
  compute_bpp = TRUE,
  original = "original",
  parallel = FALSE,
  ncores = max(parallel::detectCores(logical = FALSE) - 1, 1),
  make_cluster_args = list(),
  progress = TRUE,
  verbose = TRUE,
  skip_check_sem_out = FALSE,
  drop_equivalent_models = TRUE
)

gen_models(
  sem_out,
  must_add = NULL,
  must_not_add = NULL,
  must_drop = NULL,
  must_not_drop = NULL,
  remove_constraints = TRUE,
  exclude_error_cov = TRUE,
  df_change_add = 1,
  df_change_drop = 1,
  remove_duplicated = TRUE,
  progress = TRUE,
  output = c("partables", "model_set")
)

Arguments

sem_out: It can be the output from an SEM function. Currently it supports lavaan::lavaan objects only. If it is a named list of lavaan::lavaan objects, then all arguments for model generation will be ignored, and models will not be refitted. Users need to ensure that the models can be meaningfully compared because they will not be checked.
partables: A partables-class object, usually generated by get_add() or get_drop(). A named list of parameter tables to be fitted along with the original model in sem_out. If supplied, all arguments related to identifying models will be ignored. Default is NULL.
model_set_out: If set to the output of a previous call to model_set() (a model_set-class object), the stored list of models will be used and all arguments related to generating neighboring models will be ignored. If supplied, sem_out and partables will also be ignored, and sem_out will be retrieved from model_set_out. Default is NULL.
prior_sem_out: The prior of the model fitted in sem_out. Default is NULL, and all models will have equal prior probabilities.
must_add: A character vector of parameters, named in lavaan::lavaan() style (e.g., "y ~ x"), that must be added. Default is NULL.
must_not_add: A character vector of parameters, named in lavaan::lavaan() style (e.g., "x1 ~~ x1"), that must not be added. Default is NULL.
must_drop: A character vector of parameters, named in lavaan::lavaan() style (e.g., "y ~ x"), that must be dropped. Default is NULL.
must_not_drop: A character vector of parameters, named in lavaan::lavaan() style (e.g., "x1 ~~ x1"), that must not be dropped. Default is NULL.
remove_constraints: Whether equality constraints will be removed. Default is TRUE.
exclude_error_cov: Exclude error covariances of indicators. Default is TRUE.
exclude_feedback: Exclude paths that will result in a feedback loop. For example, if there is a path from x through m to y, then the path x ~ y will create a feedback loop. The default has been TRUE since Version 0.1.3.5 because feedback loops are usually not included unless theoretically justified. To reproduce results from a previous version, set this argument to FALSE.
exclude_xy_cov: Exclude the covariance between two variables when one has a path to the other. For example, if there is a path from x through m to y, then the covariance x ~~ y, which denotes the covariance between x and the error term of y, will be excluded if this argument is TRUE. The default has been TRUE since Version 0.1.3.5 because these covariances are rarely interpretable. To reproduce results from a previous version, set this argument to FALSE.
df_change_add: The maximum number of degrees of freedom (df) by which a model with added parameters may differ from the original model. All models with a df change less than or equal to this number will be included, subject to the requirements set by other arguments. Default is 1.
df_change_drop: The maximum number of degrees of freedom (df) by which a model with dropped parameters may differ from the original model. All models with a df change less than or equal to this number will be included, subject to the requirements set by other arguments. Default is 1.
remove_duplicated: If TRUE, the default, duplicated models are removed.
fit_models: If TRUE, the models will be fitted to the data, usually stored in sem_out. If FALSE, the models will be returned as is, in the element models of the output. If model_set_out is set and models have been fitted, then default is FALSE. Otherwise, default is TRUE.
compute_bpp: If TRUE, then BIC posterior probabilities will be computed. Default is TRUE.
original: String. The name of the original (target) model. Default is "original". Used if prior_sem_out is unnamed and has only one value.
parallel: If TRUE, parallel processing will be used to fit the models. Default is FALSE.
ncores: Numeric. The number of CPU cores to be used if parallel is TRUE.
make_cluster_args: A list of named arguments to be passed to parallel::makeCluster(). Used by advanced users to configure the cluster if parallel is TRUE. Default is list().
progress: Whether a progress bar will be displayed, implemented by the pbapply package or by utils::txtProgressBar. Default is TRUE.
verbose: Whether additional messages will be displayed, such as the expected processing time. Default is TRUE.
skip_check_sem_out: If FALSE, the default, and sem_out is set, the function checks whether sem_out is of a supported type (the estimator is "ML" and the model has only one group) and raises an error if it is not. Can be set to TRUE to skip this check when experimenting with models that are not officially supported.
drop_equivalent_models: If TRUE, the default, equivalent models will be dropped in the final output. This check can only be conducted when no models are fitted in lavaan::lavaan() with fixed.x = TRUE (which is the default of lavaan::sem()).
output: If "model_set", then the output is a model_set-class object. If "partables", the output is a partables-class object. Default is "partables".
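
The feedback-loop rule described for exclude_feedback can be illustrated with a small base R sketch. The helper creates_feedback() below is hypothetical and is not part of the package; it only shows the idea that adding lhs ~ rhs closes a loop when lhs can already reach rhs through existing paths. The package's own check may be implemented differently.

```r
# Hypothetical helper, NOT the package's implementation.
# paths: data frame with columns lhs and rhs, one row per path "lhs ~ rhs"
# (that is, rhs predicts lhs).
# Returns TRUE if adding "lhs ~ rhs" would create a feedback loop.
creates_feedback <- function(paths, lhs, rhs) {
  visited <- character(0)
  queue <- lhs
  while (length(queue) > 0) {
    node <- queue[1]
    queue <- queue[-1]
    # lhs already reaches rhs, so the new path rhs -> lhs closes a loop
    if (node == rhs) return(TRUE)
    if (node %in% visited) next
    visited <- c(visited, node)
    # Follow existing paths forward: variables predicted by `node`
    queue <- c(queue, paths$lhs[paths$rhs == node])
  }
  FALSE
}

# Existing paths: m ~ x and y ~ m (x -> m -> y)
paths <- data.frame(lhs = c("m", "y"),
                    rhs = c("x", "m"),
                    stringsAsFactors = FALSE)

creates_feedback(paths, lhs = "x", rhs = "y")  # adding x ~ y
creates_feedback(paths, lhs = "y", rhs = "x")  # adding y ~ x
```

In this sketch, adding x ~ y is flagged because x already leads to y through m, while adding y ~ x is not.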

Value

The function model_set() returns an object of the class model_set, a list with the following major elements:

models: A named list of parameter tables. Each represents one of the models identified.
bic: A numeric vector, of the same length as models. The BIC of each model.
postprob: A numeric vector, of the same length as models. The BIC posterior probability of each model.
fit: A named list of the fitted models, each the output of lavaan::lavaan() or update(), of the same length as models.
change: A numeric vector, of the same length as models. The change in model df for each fit. A positive number denotes one less free parameter. A negative number denotes one more free parameter or one less constraint.
converged: A named vector of boolean values, of the same length as models. Indicates whether each fit converged or not.
post_check: A named vector of boolean values, of the same length as models. Indicates whether the solution of each fit is admissible or not. Checked by lavaan::lavInspect().

The object returned by gen_models() depends on the argument output. See the argument output for details.

Details

It computes the BIC posterior probabilities of a set of models by the method presented in Wu, Cheung, and Leung (2020).

First, a list of models is identified based on user-specified criteria. By default, the models that differ from the fitted model by one degree of freedom, the 1-df-away neighboring models, will be found using get_add() and get_drop().

Second, these models will be fitted to the sample dataset, and their BICs will be computed.

Third, their BIC posterior probabilities will be computed using their BICs. By default, equal prior probabilities for all the models being fitted will be assumed in the current version. This can be changed by prior_sem_out.
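
The third step can be sketched in base R. The helper bpp_from_bic() below is hypothetical and is not the package's internal code. It assumes the usual BIC approximation, in which each model's weight is proportional to its prior times exp(-BIC / 2), and it is applied to the BIC values printed in the Examples section. The unequal prior vector is also only an assumption for illustration.

```r
# Hypothetical helper, NOT the package's internal code.
# Assumes BPP_i is proportional to prior_i * exp(-BIC_i / 2).
bpp_from_bic <- function(bic,
                         prior = rep(1 / length(bic), length(bic))) {
  # Subtract the smallest BIC before exponentiating to avoid underflow;
  # this does not change the normalized result.
  w <- exp(-(bic - min(bic)) / 2) * prior
  w / sum(w)
}

# BIC values taken from the printed output in the Examples section
bic <- c("add: x4~x2"           = 400.291,
         "original"             = 431.452,
         "add: (x3~x1),(x4~x1)" = 435.397,
         "drop: x3~~x4"         = 441.229,
         "drop: x3~x2"          = 455.926)

# Equal priors (0.2 each), as in the default
bpp_equal <- bpp_from_bic(bic)
round(bpp_equal, 3)

# Illustrative unequal priors: 0.5 for the original model and the
# remainder shared equally by the other models (an assumption)
prior <- c(0.125, 0.5, 0.125, 0.125, 0.125)
bpp_unequal <- bpp_from_bic(bic, prior = prior)
round(bpp_unequal, 3)
```

With equal priors, the rounded values agree with the BPP column in the printed output: the model with the smallest BIC takes nearly all of the posterior probability.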

The results can then be printed, with the models sorted by descending order of BIC posterior probabilities. The results can also be visualized using model_graph().

Functions

model_set(): Compute the BPPs of a list of models. Can generate the models and/or fit the models. Can also accept pregenerated models, or just update BPPs.
gen_models(): Generate a list of models (parameter tables).
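
The different uses of the two functions can be sketched as follows, using only the arguments documented above. This is a minimal illustration rather than a recommended workflow; the prior value 0.5 is arbitrary, and the model is the one used in the Examples section.

```r
library(lavaan)
library(modelbpp)

mod <-
"
x3 ~ a*x1 + b*x2
x4 ~ a*x1
ab := a*b
"
fit <- sem(mod, dat_path_model, fixed.x = TRUE)

# Generate the neighboring models only, without fitting them
pt <- gen_models(fit, progress = FALSE)
class(pt)

# Generate and fit the models, then compute the BPPs
out <- model_set(fit, progress = FALSE, verbose = FALSE)

# Reuse the fitted models and only update the BPPs,
# giving the original model a prior probability of 0.5
out_prior <- model_set(model_set_out = out,
                       prior_sem_out = 0.5,
                       progress = FALSE,
                       verbose = FALSE)
out_prior
```

Because model_set_out already holds fitted models, the default of fit_models is FALSE in the last call, so the models are not refitted.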

References

Wu, H., Cheung, S. F., & Leung, S. O. (2020). Simple use of BIC to assess model selection uncertainty: An illustration using mediation and moderation models. Multivariate Behavioral Research, 55(1), 1–16. doi:10.1080/00273171.2019.1574546

Author

Shu Fai Cheung https://orcid.org/0000-0002-9871-9448

Examples


library(lavaan)
library(modelbpp)

dat <- dat_path_model

mod <-
"
x3 ~ a*x1 + b*x2
x4 ~ a*x1
ab := a*b
"

fit <- sem(mod, dat, fixed.x = TRUE)

out <- model_set(fit)
#> 
#> Generate 2 less restrictive model(s):
#> 
#> 
#> Generate 2 more restrictive model(s):
#> 
#> 
#> Check for duplicated models (5 model[s] to check):
#> 
#> 
#> Fit the 5 model(s) (duplicated models removed):
out
#> 
#> Call:
#> model_set(sem_out = fit)
#> 
#> Number of model(s) fitted           : 5
#> Number of model(s) converged        : 5
#> Number of model(s) passed post.check: 5
#> 
#> The models (sorted by BPP):
#>                      model_df df_diff Prior     BIC   BPP   cfi rmsea  srmr
#> add: x4~x2                  1       1 0.200 400.291 1.000 1.000 0.017 0.023
#> original                    2       0 0.200 431.452 0.000 0.736 0.417 0.194
#> add: (x3~x1),(x4~x1)        1       1 0.200 435.397 0.000 0.733 0.593 0.193
#> drop: x3~~x4                3      -1 0.200 441.229 0.000 0.634 0.401 0.231
#> drop: x3~x2                 3      -1 0.200 455.926 0.000 0.522 0.458 0.255
#> 
#> Note:
#> - BIC: Bayesian Information Criterion.
#> - BPP: BIC posterior probability.
#> - model_df: Model degrees of freedom.
#> - df_diff: Difference in df compared to the original/target model.
#> - To show cumulative BPPs, call print() with 'cumulative_bpp = TRUE'.
#> - At least one model has fixed.x = TRUE. The models are not checked for
#>   equivalence.
#> - Since Version 0.1.3.5, the default values of exclude_feedback and
#>   exclude_xy_cov changed to TRUE. Set them to FALSE to reproduce
#>   results from previous versions.