Compute the conditional effects in a moderated regression model.
Usage
cond_effect(
output,
x = NULL,
w = NULL,
w_method = c("sd", "percentile"),
w_percentiles = c(0.16, 0.5, 0.84),
w_sd_to_percentiles = NA,
w_from_mean_in_sd = 1,
w_values = NULL
)
cond_effect_boot(
output,
x = NULL,
w = NULL,
...,
conf = 0.95,
nboot = 100,
boot_args = NULL,
save_boot_est = TRUE,
full_output = FALSE,
do_boot = TRUE
)
Arguments
- output
The output from
stats::lm()
. It can also accept the output fromstd_selected()
orstd_selected_boot()
.- x
The focal variable (independent variable), that is, the variable with its effect on the outcome variable (dependent) being moderated. It must be a numeric variable.
- w
The moderator. Both numeric variables and categorical variables (character or factor) are supported.
- w_method
How to define "low", "medium", and "high" for the moderator levels. Default is in terms of mean and standard deviation (SD) of the moderator,
"sd"
: "low", "medium", and "high" are one SD below mean, mean, and one SD above mean, respectively. If equal to"percentile"
, then percentiles of the moderator in the dataset are used: "low", "medium", and "high" are 16th, 50th (median), and 84th percentiles, respectively. Ignored ifw
is categorical.- w_percentiles
If
w_method
is"percentile"
, then this argument specifies the three percentiles to be used, divided by 100. It must be a vector of two numbers. The default isc(.16, .50, .84)
, the 16th, 50th, and 84th percentiles, which corresponds approximately to one SD below and above mean in a normal distribution, respectively. Ignored ifw
is categorical.- w_sd_to_percentiles
If
w_method
is"percentile"
and this argument is set to a number, this number will be used to to determine the percentiles to be used. The lower percentile is the percentile in a normal distribution that isw_sd_to_percentiles
SD below the mean. The upper percentile is the percentile in a normal distribution that isw_sd_to_percentiles
SD above the mean. Therefore, ifw_sd_to_percentiles
is set to 1, then the lower and upper percentiles are 16th and 84th, respectively. Default isNA
.- w_from_mean_in_sd
How many SD from mean is used to define "low" and "high" for the moderator. Default is 1. Ignored if
w
is categorical.- w_values
The values of
w
to be used. Default isNULL
. If a numeric vector is supplied, these values will be used to compute the conditional effects. Other arguments on generating levels are ignored. Note that, ifw
has been standardized or centered, these values are for the standardized or centeredw
. The values will always be sorted. This argument is ignored ifw
is categorical.- ...
Arguments to be passed to
cond_effect()
.- conf
The level of confidence for the confidence interval. Default is .95, to get 95% confidence intervals.
- nboot
The number of bootstrap samples. Default is 100.
- boot_args
A named list of arguments to be passed to
boot::boot()
. Default isNULL
.- save_boot_est
If
TRUE
, the default, the bootstrap estimates will be saved in the elementboot_est
of the output.- full_output
Whether the full output from
boot::boot()
will be returned. Default isFALSE
. IfTRUE
, the full output fromboot::boot()
will be saved in the elementboot_out
of the output.- do_boot
Whether bootstrapping confidence intervals will be formed. Default is
TRUE
. IfFALSE
, all arguments related to bootstrapping will be ignored.
Value
cond_effect()
returns a data-frame-like object of the conditional effects.
The class is
cond_effect
and the print method will print additional information of
the conditional effects. Additional information is stored in the
following attributes:
call
: The original call.output
: Theoutput
object, such as the output fromlm()
.x
,y
, andw
: The three variables used to compute the conditional effects: focal variable (x
), outcome variable (y
), and the moderator (w
).w_method
: The method used to determine the values of the moderator at the selected levels.w_percentiles
The percentiles to use ifw_method
="percentile"
.w_sd_to_percentiles
: If not equal toNA
, this is a scalar, the number of standard deviation from the mean used to determine the percentiles for the "low" and "high" levels of the moderator.w_from_mean_in_sd
: The number of SD above or below the mean, for determining the "low" and "high" levels of the moderator ifw_method
is"sd"
.w_empirical_percentiles
: The actual percentile levels in the dataset for the selected levels of the moderator. A numeric vector.w_empirical_z
: The actual distance from the mean, in SD, of each selected level of the moderator. A numeric vector.y_standardized
,x_standardized
, andw_standardized
: Each of them is a logical scalar, indicating whether the outcome variable, focal variable, and moderator are standardized.
cond_effect_boot()
also returns a data-frame-like object of the
conditional effects of the class
cond_effect
, with additional information from the bootstrapping stored
in these attributes:
boot_ci
: A data frame of the bootstrap confidence intervals of the conditional effects.nboot
: The number of bootstrap samples requested.conf
: The level of confidence, in proportion.boot_est
: A matrix of the bootstrap estimates of the conditional effects. The number of rows equal tonboot
, and the number of columns equal to the number of levels of the moderator.cond_effect_boot_call
: The call tocond_effect_boot()
.boot_out
: If available, the original output fromboot::boot()
.
Details
cond_effect()
uses the centering approach to find the conditional
effect of the focal variable. For each level of the moderator, the value for
this level is subtracted from the moderator scores, and the model is
fitted to the modified data.
The coefficient of the focal variable is then the conditional effect of the
focal variable when the moderator's score is equal this value.
cond_effect_boot()
function is a wrapper of cond_effect()
.
It calls cond_effect()
once for each bootstrap sample, and then computes the nonparametric
bootstrap percentile confidence intervals (Cheung, Cheung, Lau, Hui,
& Vong, 2022). If the output object is the output of std_selected()
or std_selected_boot()
, in which mean-centering and/or standardization
have been conducted, they will be repeated in each bootstrap sample.
Therefore, like std_selected_boot()
, it can be used for form
nonparametric bootstrap confidence intervals for standardized
effects, though cond_effect_boot()
does this for the standardized
conditional effects.
This function ignores bootstrapping done by std_selected_boot()
. It will
do its own bootstrapping.
If do_boot
is FALSE
, then the object it returns is identical to that
by cond_effect()
.
This function intentionally does not have an argument for setting the seed
for
random number. Users are recommended to set the seed, e.g., using
set.seed()
before calling it, to ensure reproducibility.
Functions
cond_effect_boot()
: A wrapper ofcond_effect()
that forms nonparametric bootstrap confidence intervals.
Author
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448
Examples
# Load a sample data set
dat <- test_x_1_w_1_v_1_cat1_n_500
# Do a moderated regression by lm
lm_raw <- lm(dv ~ iv*mod + v1 + cat1, dat)
summary(lm_raw)
#>
#> Call:
#> lm(formula = dv ~ iv * mod + v1 + cat1, data = dat)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2146.0 -431.9 -25.0 411.2 2309.3
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 308.767 4075.066 0.076 0.9396
#> iv 52.760 271.242 0.195 0.8459
#> mod 5.127 40.772 0.126 0.9000
#> v1 -12.760 10.174 -1.254 0.2104
#> cat1gp2 -158.673 71.834 -2.209 0.0276 *
#> cat1gp3 -43.166 75.283 -0.573 0.5666
#> iv:mod 3.416 2.709 1.261 0.2080
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 665 on 493 degrees of freedom
#> Multiple R-squared: 0.6352, Adjusted R-squared: 0.6307
#> F-statistic: 143 on 6 and 493 DF, p-value: < 2.2e-16
#>
cond_effect(lm_raw, x = iv, w = mod)
#> The effects of iv on dv, conditional on mod:
#>
#> Level mod iv Effect S.E. t p Sig
#> High 105.436 412.911 20.827 19.826 0.000 ***
#> Medium 100.395 395.693 14.684 26.948 0.000 ***
#> Low 95.354 378.474 19.249 19.662 0.000 ***
#>
#>
#> The regression model:
#>
#> dv ~ iv * mod + v1 + cat1
#>
#> Interpreting the levels of mod:
#>
#> Level mod % Below From Mean (in SD)
#> High 105.436 84.00 1.00
#> Medium 100.395 47.40 0.00
#> Low 95.354 17.20 -1.00
#>
#> - % Below: The percent of cases equal to or less than a level.
#> - From Mean (in SD): Distance of a level from the mean, in standard
#> deviation (+ve above, -ve below).
lm_std <- std_selected(lm_raw, to_standardize = ~ iv + mod)
cond_effect(lm_std, x = iv, w = mod)
#> The effects of iv on dv, conditional on mod:
#>
#> Level mod iv Effect S.E. t p Sig
#> High 1.000 841.990 42.468 19.826 0.000 ***
#> Medium 0.000 806.878 29.942 26.948 0.000 ***
#> Low -1.000 771.767 39.251 19.662 0.000 ***
#>
#>
#> The regression model:
#>
#> dv ~ iv * mod + v1 + cat1
#>
#> Interpreting the levels of mod:
#>
#> Level mod % Below From Mean (in SD)
#> High 1.000 84.00 1.00
#> Medium 0.000 47.40 0.00
#> Low -1.000 17.20 -1.00
#>
#> - % Below: The percent of cases equal to or less than a level.
#> - From Mean (in SD): Distance of a level from the mean, in standard
#> deviation (+ve above, -ve below).
#>
#> Note:
#>
#> - The variable(s) iv, mod is/are standardized.
#> - One or more variables are scaled by SD or standardized. OLS standard
#> errors and confidence intervals may be biased for their coefficients.
#> Please use `cond_effect_boot()`.
# Categorical moderator
lm_cat <- lm(dv ~ iv*cat1 + v1, dat)
summary(lm_cat)
#>
#> Call:
#> lm(formula = dv ~ iv * cat1 + v1, data = dat)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2457.67 -506.03 3.46 437.95 2738.18
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 979.20 459.19 2.132 0.0335 *
#> iv 391.03 29.25 13.370 <2e-16 ***
#> cat1gp2 -845.49 584.85 -1.446 0.1489
#> cat1gp3 259.55 620.76 0.418 0.6760
#> v1 -19.36 11.00 -1.759 0.0791 .
#> iv:cat1gp2 43.28 38.44 1.126 0.2608
#> iv:cat1gp3 -21.22 41.08 -0.516 0.6058
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 721.3 on 493 degrees of freedom
#> Multiple R-squared: 0.5707, Adjusted R-squared: 0.5655
#> F-statistic: 109.2 on 6 and 493 DF, p-value: < 2.2e-16
#>
cond_effect(lm_cat, x = iv, w = cat1)
#> The effects of iv on dv, conditional on cat1:
#>
#> Level cat1 iv Effect S.E. t p Sig
#> gp1 gp1 391.026 29.246 13.370 0.000 ***
#> gp2 gp2 434.302 24.937 17.416 0.000 ***
#> gp3 gp3 369.807 28.858 12.815 0.000 ***
#>
#>
#> The regression model:
#>
#> dv ~ iv * cat1 + v1
# Load a sample data set
dat <- test_x_1_w_1_v_1_cat1_n_500
# Do a moderated regression by lm
lm_raw <- lm(dv ~ iv*mod + v1 + cat1, dat)
summary(lm_raw)
#>
#> Call:
#> lm(formula = dv ~ iv * mod + v1 + cat1, data = dat)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2146.0 -431.9 -25.0 411.2 2309.3
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 308.767 4075.066 0.076 0.9396
#> iv 52.760 271.242 0.195 0.8459
#> mod 5.127 40.772 0.126 0.9000
#> v1 -12.760 10.174 -1.254 0.2104
#> cat1gp2 -158.673 71.834 -2.209 0.0276 *
#> cat1gp3 -43.166 75.283 -0.573 0.5666
#> iv:mod 3.416 2.709 1.261 0.2080
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 665 on 493 degrees of freedom
#> Multiple R-squared: 0.6352, Adjusted R-squared: 0.6307
#> F-statistic: 143 on 6 and 493 DF, p-value: < 2.2e-16
#>
lm_std <- std_selected(lm_raw, to_standardize = ~ iv + mod)
cond_effect(lm_std, x = iv, w = mod)
#> The effects of iv on dv, conditional on mod:
#>
#> Level mod iv Effect S.E. t p Sig
#> High 1.000 841.990 42.468 19.826 0.000 ***
#> Medium 0.000 806.878 29.942 26.948 0.000 ***
#> Low -1.000 771.767 39.251 19.662 0.000 ***
#>
#>
#> The regression model:
#>
#> dv ~ iv * mod + v1 + cat1
#>
#> Interpreting the levels of mod:
#>
#> Level mod % Below From Mean (in SD)
#> High 1.000 84.00 1.00
#> Medium 0.000 47.40 0.00
#> Low -1.000 17.20 -1.00
#>
#> - % Below: The percent of cases equal to or less than a level.
#> - From Mean (in SD): Distance of a level from the mean, in standard
#> deviation (+ve above, -ve below).
#>
#> Note:
#>
#> - The variable(s) iv, mod is/are standardized.
#> - One or more variables are scaled by SD or standardized. OLS standard
#> errors and confidence intervals may be biased for their coefficients.
#> Please use `cond_effect_boot()`.
# Form nonparametric bootstrap confidence intervals
# Use 2000 or even 5000 for nboot in real research
out <- cond_effect_boot(lm_std, x = iv, w = mod, nboot = 50)
out
#> The effects of iv on dv, conditional on mod:
#>
#> Level mod iv Effect CI Lower CI Upper S.E. t p Sig
#> High 1.000 841.990 745.556 969.398 42.468 19.826 0.000 ***
#> Medium 0.000 806.878 731.389 882.734 29.942 26.948 0.000 ***
#> Low -1.000 771.767 689.364 835.260 39.251 19.662 0.000 ***
#>
#> [CI Lower, CI Upper] shows the 95% nonparametric bootstrap confidence
#> interval(s) (based on 50 bootstrap samples).
#>
#>
#> The regression model:
#>
#> dv ~ iv * mod + v1 + cat1
#>
#> Interpreting the levels of mod:
#>
#> Level mod % Below From Mean (in SD)
#> High 1.000 84.00 1.00
#> Medium 0.000 47.40 0.00
#> Low -1.000 17.20 -1.00
#>
#> - % Below: The percent of cases equal to or less than a level.
#> - From Mean (in SD): Distance of a level from the mean, in standard
#> deviation (+ve above, -ve below).
#>
#> Note:
#>
#> - The variable(s) iv, mod is/are standardized.