Title: | Influential Cases in Structural Equation Modeling |
---|---|
Description: | Sensitivity analysis in structural equation modeling using influence measures and diagnostic plots. Support leave-one-out casewise sensitivity analysis presented by Pek and MacCallum (2011) <doi:10.1080/00273171.2011.561068> and approximate casewise influence using scores and casewise likelihood. |
Authors: | Shu Fai Cheung [aut, cre] |
Maintainer: | Shu Fai Cheung <[email protected]> |
License: | GPL-3 |
Version: | 0.1.8.6 |
Built: | 2025-01-25 08:24:03 UTC |
Source: | https://github.com/sfcheung/semfindr |
Gets a 'lavaan' output and checks whether it is supported by the functions using the approximate approach.
approx_check( fit, print_messages = TRUE, multiple_group = TRUE, equality_constraint = TRUE )
approx_check( fit, print_messages = TRUE, multiple_group = TRUE, equality_constraint = TRUE )
fit |
The output from |
print_messages |
Logical. If |
multiple_group |
Logical. Whether multiple-group models are
supported. If yes, the check for multiple-groups models will be
skipped. Default is |
equality_constraint |
Logical. Whether models with
equality constraints are
supported. If yes, the check for equality constraints will be
skipped. Default is |
This function is not supposed to be used by users. It is
called by functions such as est_change_approx()
to see if the
analysis passed to
it is supported. If not, messages will be printed to indicate why.
A single-element vector. If confirmed to be supported, will return 0. If not confirmed to be support but may still work, return 1. If confirmed to be not yet supported, will return a negative number, the value of this number without the negative sign is the number of tests failed.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
dat <- cfa_dat mod <- " f1 =~ x4 + x5 + x6 " dat_gp <- dat dat$gp <- rep(c("gp1", "gp2"), length.out = nrow(dat_gp)) fit01 <- lavaan::sem(mod, dat) # If supported, returns a zero approx_check(fit01) fit05 <- lavaan::cfa(mod, dat, group = "gp") # If not supported, returns a negative number approx_check(fit05)
dat <- cfa_dat mod <- " f1 =~ x4 + x5 + x6 " dat_gp <- dat dat$gp <- rep(c("gp1", "gp2"), length.out = nrow(dat_gp)) fit01 <- lavaan::sem(mod, dat) # If supported, returns a zero approx_check(fit01) fit05 <- lavaan::cfa(mod, dat, group = "gp") # If not supported, returns a negative number approx_check(fit05)
A six-variable dataset with 100 cases.
cfa_dat
cfa_dat
A data frame with 100 rows and 6 variables:
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
library(lavaan) data(cfa_dat) mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 " fit <- cfa(mod, cfa_dat) summary(fit)
library(lavaan) data(cfa_dat) mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 " fit <- cfa(mod, cfa_dat) summary(fit)
A six-variable dataset with 60 cases, with one case resulting in negative variance if not removed.
cfa_dat_heywood
cfa_dat_heywood
A data frame with 60 rows and 6 variables:
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
library(lavaan) data(cfa_dat_heywood) mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 " # The following will result in a warning fit <- cfa(mod, cfa_dat_heywood) # One variance is negative parameterEstimates(fit, output = "text") # Fit the model with the first case removed fit_no_case_1 <- cfa(mod, cfa_dat_heywood[-1, ]) # Results admissible parameterEstimates(fit_no_case_1, output = "text")
library(lavaan) data(cfa_dat_heywood) mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 " # The following will result in a warning fit <- cfa(mod, cfa_dat_heywood) # One variance is negative parameterEstimates(fit, output = "text") # Fit the model with the first case removed fit_no_case_1 <- cfa(mod, cfa_dat_heywood[-1, ]) # Results admissible parameterEstimates(fit_no_case_1, output = "text")
A six-variable dataset with 100 cases.
cfa_dat_mg
cfa_dat_mg
A data frame with 100 rows and 6 variables:
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Group variable. Character. Either "GroupA" or "GroupB".
library(lavaan) data(cfa_dat_mg) mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 " fit1 <- cfa(mod, cfa_dat_mg, group = "gp") fit2 <- cfa(mod, cfa_dat_mg, group = "gp", group.equal = "loadings") fit3 <- cfa(mod, cfa_dat_mg, group = "gp", group.equal = c("loadings", "intercepts")) lavTestLRT(fit1, fit2, fit3) lavTestLRT(fit1, fit3) # Drop the first case cfa_dat_mgb <- cfa_dat_mg[-1, ] fit1b <- cfa(mod, cfa_dat_mgb, group = "gp") fit2b <- cfa(mod, cfa_dat_mgb, group = "gp", group.equal = "loadings") fit3b <- cfa(mod, cfa_dat_mgb, group = "gp", group.equal = c("loadings", "intercepts")) lavTestLRT(fit1b, fit2b, fit3b) lavTestLRT(fit1b, fit3b)
library(lavaan) data(cfa_dat_mg) mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 " fit1 <- cfa(mod, cfa_dat_mg, group = "gp") fit2 <- cfa(mod, cfa_dat_mg, group = "gp", group.equal = "loadings") fit3 <- cfa(mod, cfa_dat_mg, group = "gp", group.equal = c("loadings", "intercepts")) lavTestLRT(fit1, fit2, fit3) lavTestLRT(fit1, fit3) # Drop the first case cfa_dat_mgb <- cfa_dat_mg[-1, ] fit1b <- cfa(mod, cfa_dat_mgb, group = "gp") fit2b <- cfa(mod, cfa_dat_mgb, group = "gp", group.equal = "loadings") fit3b <- cfa(mod, cfa_dat_mgb, group = "gp", group.equal = c("loadings", "intercepts")) lavTestLRT(fit1b, fit2b, fit3b) lavTestLRT(fit1b, fit3b)
A six-variable dataset with 100 cases, with one influential case.
cfa_dat2
cfa_dat2
A data frame with 100 rows and 7 variables:
Case ID. Character.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
library(lavaan) data(cfa_dat2) mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 " fit <- cfa(mod, cfa_dat2) summary(fit) inf_out <- influence_stat(fit) gcd_plot(inf_out)
library(lavaan) data(cfa_dat2) mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 " fit <- cfa(mod, cfa_dat2) summary(fit) inf_out <- influence_stat(fit) gcd_plot(inf_out)
Gets a lavaan_rerun()
output and computes the
standardized changes in selected parameters for each case
if included.
est_change(rerun_out, parameters = NULL)
est_change(rerun_out, parameters = NULL)
rerun_out |
The output from |
parameters |
A character vector to specify the selected
parameters. Each parameter is named as in |
For each case, est_change()
computes the differences in
the estimates of selected parameters with and without this case:
(Estimate with all case) - (Estimate without this case).
The differences are standardized by dividing the raw differences by their standard errors (Pek & MacCallum, 2011). This is a measure of the standardized influence of a case on the parameter estimates if it is included.
If the value of a case is positive, including the case increases an estimate.
If the value of a case is negative, including the case decreases an estimate.
If the analysis is not admissible or does not converge when a case
is deleted, NA
s will be turned for this case on the differences.
Unlike est_change_raw()
, est_change()
does not support
computing the standardized changes of standardized estimates.
It will also compute generalized Cook's distance (gCD), proposed by Pek and MacCallum (2011) for structural equation modeling. Only the parameters selected (all free parameters, by default) will be used in computing gCD.
Since version 0.1.4.8, if (a) a model has one or more equality constraints, and (b) some selected parameters are linearly dependent or constrained to be equal due to the constraint(s), gCD will be computed by removing parameters such that the remaining parameters are not linearly dependent nor constrained to be equal. (Support for equality constraints and linearly dependent parameters available in 0.1.4.8 and later version).
Supports both single-group and multiple-group models. (Support for multiple-group models available in 0.1.4.8 and later version).
An est_change
-class object, which is
matrix with the number of columns equals to the number of
requested parameters plus one, the last column being the
generalized Cook's
distance. The number of rows equal to the number
of cases. The row names are the case identification values used in
lavaan_rerun()
. The elements are the standardized difference.
Please see Pek and MacCallum (2011), Equation 7.
A print method is available for user-friendly output.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
Pek, J., & MacCallum, R. (2011). Sensitivity analysis in structural equation models: Cases and their influence. Multivariate Behavioral Research, 46(2), 202-228. doi:10.1080/00273171.2011.561068
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model several times. Each time with one case removed. # For illustration, do this only for four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 4, 7, 9)) # Compute the standardized changes in parameter estimates # if a case is included vs. if this case is excluded. # That is, case influence on parameter estimates, standardized. out <- est_change(fit_rerun) # Case influence: out # Note that these are the differences divided by the standard errors # The rightmost column, `gcd`, contains the # generalized Cook's distances (Pek & MacCallum, 2011). out[, "gcd", drop = FALSE] # Compute the changes for the paths from iv1 and iv2 to m1 out2 <- est_change(fit_rerun, c("m1 ~ iv1", "m1 ~ iv2")) # Case influence: out2 # Note that only the changes in the selected parameters are included. # The generalized Cook's distance is computed only from the selected # parameter estimates. # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) # Examine four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 3, 5, 7)) # Compute the standardized changes in parameter estimates # if a case is included vs. if a case is excluded. # That is, case influence on parameter estimates, standardized. # For free loadings only out <- est_change(fit_rerun, parameters = "=~") out # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::sem(mod, dat) # Examine four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 3, 5, 7)) # Compute the changes in parameter estimates if a case is included # vs. if a case is excluded. # That is, standardized case influence on parameter estimates. # For structural paths only out <- est_change(fit_rerun, parameters = "~") out
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model several times. Each time with one case removed. # For illustration, do this only for four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 4, 7, 9)) # Compute the standardized changes in parameter estimates # if a case is included vs. if this case is excluded. # That is, case influence on parameter estimates, standardized. out <- est_change(fit_rerun) # Case influence: out # Note that these are the differences divided by the standard errors # The rightmost column, `gcd`, contains the # generalized Cook's distances (Pek & MacCallum, 2011). out[, "gcd", drop = FALSE] # Compute the changes for the paths from iv1 and iv2 to m1 out2 <- est_change(fit_rerun, c("m1 ~ iv1", "m1 ~ iv2")) # Case influence: out2 # Note that only the changes in the selected parameters are included. # The generalized Cook's distance is computed only from the selected # parameter estimates. # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) # Examine four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 3, 5, 7)) # Compute the standardized changes in parameter estimates # if a case is included vs. if a case is excluded. # That is, case influence on parameter estimates, standardized. # For free loadings only out <- est_change(fit_rerun, parameters = "=~") out # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::sem(mod, dat) # Examine four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 3, 5, 7)) # Compute the changes in parameter estimates if a case is included # vs. if a case is excluded. # That is, standardized case influence on parameter estimates. # For structural paths only out <- est_change(fit_rerun, parameters = "~") out
Gets a lavaan::lavaan()
output and computes the
approximate standardized changes in selected parameters for each case
if included.
est_change_approx( fit, parameters = NULL, case_id = NULL, allow_inadmissible = FALSE, skip_all_checks = FALSE )
est_change_approx( fit, parameters = NULL, case_id = NULL, allow_inadmissible = FALSE, skip_all_checks = FALSE )
fit |
The output from |
parameters |
A character vector to specify the selected
parameters. Each parameter is named as in |
case_id |
If it is a character vector of length equals to the
number of cases (the number of rows in the data in |
allow_inadmissible |
If |
skip_all_checks |
If |
For each case, est_change_approx()
computes the
approximate differences in the estimates of selected parameters
with and without this case:
(Estimate with all case) - (Estimate without this case)
The differences are standardized by dividing the approximate raw differences by their standard errors. This is a measure of the standardized influence of a case on the parameter estimates if it is included.
If the value of a case is positive, including the case increases an estimate.
If the value of a case is negative, including the case decreases an estimate.
The model is not refitted. Therefore, the result is only an
approximation of that of est_change()
. However, this
approximation is useful for identifying potentially influential
cases when the sample size is very large or the model takes a long
time to fit. This function can be used to identify potentially
influential cases quickly and then select them to conduct the
leave-one-out sensitivity analysis using lavaan_rerun()
and
est_change()
.
This function also computes the approximate generalized Cook's
distance (gCD). To avoid confusion, it is labelled gcd_approx
.
For the technical details, please refer to the vignette
on this approach: vignette("casewise_scores", package = "semfindr")
The approximate approach supports a model with equality constraints (available in 0.1.4.8 and later version).
Supports both single-group and multiple-group models. (Support for multiple-group models available in 0.1.4.8 and later version).
An est_change
-class object, which is
matrix with the number of columns equals to the number of
requested parameters plus one, the last column being the
approximate generalized Cook's
distance. The number of rows equal to the number
of cases. The row names are the case identification values used in
lavaan_rerun()
. The elements are approximate standardized
differences.
A print method is available for user-friendly output.
Idea by Mark Hok Chio Lai https://orcid.org/0000-0002-9196-7406, implemented by Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Approximate standardized changes and gCD out_approx <- est_change_approx(fit) head(out_approx) # Fit the model several times. Each time with one case removed. # For illustration, do this only for the first 10 cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Compute the changes in chisq if a case is removed out <- est_change(fit_rerun) head(out) # Compare the results plot(out_approx[1:10, 1], out[, 1]) abline(a = 0, b = 1) plot(out_approx[1:10, 2], out[, 2]) abline(a = 0, b = 1) plot(out_approx[1:10, 3], out[, 3]) abline(a = 0, b = 1) plot(out_approx[1:10, "gcd_approx"], out[, "gcd"]) abline(a = 0, b = 1) # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) summary(fit) # Approximate standardized changes and gCD # Compute gCD only for free loadings out_approx <- est_change_approx(fit, parameters = "=~") head(out_approx) # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Approximate standardized changes and gCD # Compute gCD only for structural paths out_approx <- est_change_approx(fit, parameters = "~") head(out_approx)
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Approximate standardized changes and gCD out_approx <- est_change_approx(fit) head(out_approx) # Fit the model several times. Each time with one case removed. # For illustration, do this only for the first 10 cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Compute the changes in chisq if a case is removed out <- est_change(fit_rerun) head(out) # Compare the results plot(out_approx[1:10, 1], out[, 1]) abline(a = 0, b = 1) plot(out_approx[1:10, 2], out[, 2]) abline(a = 0, b = 1) plot(out_approx[1:10, 3], out[, 3]) abline(a = 0, b = 1) plot(out_approx[1:10, "gcd_approx"], out[, "gcd"]) abline(a = 0, b = 1) # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) summary(fit) # Approximate standardized changes and gCD # Compute gCD only for free loadings out_approx <- est_change_approx(fit, parameters = "=~") head(out_approx) # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Approximate standardized changes and gCD # Compute gCD only for structural paths out_approx <- est_change_approx(fit, parameters = "~") head(out_approx)
Gets the output of
functions such as est_change()
and
est_change_approx()
and plots case
influence on selected parameters.
est_change_plot( change, parameters, cutoff_change = NULL, largest_change = 1, title = TRUE, point_aes = list(), vline_aes = list(), hline_aes = list(), cutoff_line_aes = list(), case_label_aes = list(), wrap_aes = list() ) est_change_gcd_plot( change, parameters, cutoff_gcd = NULL, largest_gcd = 1, cutoff_change = NULL, largest_change = 1, title = TRUE, point_aes = list(), hline_aes = list(), cutoff_line_aes = list(), case_label_aes = list(), wrap_aes = list() )
est_change_plot( change, parameters, cutoff_change = NULL, largest_change = 1, title = TRUE, point_aes = list(), vline_aes = list(), hline_aes = list(), cutoff_line_aes = list(), case_label_aes = list(), wrap_aes = list() ) est_change_gcd_plot( change, parameters, cutoff_gcd = NULL, largest_gcd = 1, cutoff_change = NULL, largest_change = 1, title = TRUE, point_aes = list(), hline_aes = list(), cutoff_line_aes = list(), case_label_aes = list(), wrap_aes = list() )
change |
The output from
|
parameters |
If it is
a character vector, it
specifies the selected parameters.
Each parameter is named as in
|
cutoff_change |
Cases with
absolute changes larger than this
value will be labeled. Default is
|
largest_change |
The number of cases with the largest absolute changes to be labelled. Default is
|
title |
If |
point_aes |
A named list of
arguments to be passed to
|
vline_aes |
A named list of
arguments to be passed to
|
hline_aes |
A named list of
arguments to be passed to
|
cutoff_line_aes |
A named list
of arguments to be passed to
|
case_label_aes |
A named list of
arguments to be passed to
|
wrap_aes |
A named list of
arguments to be passed to
|
cutoff_gcd |
Cases with
generalized Cook's distance or
approximate generalized Cook's
distance larger than this value will
be labeled. Default is |
largest_gcd |
The number of cases with the largest generalized Cook's distance or approximate generalized Cook's distance to be labelled. Default is 1. If not an integer, it will be rounded to the nearest integer. |
The output of
est_change()
, est_change_raw()
,
est_change_approx()
, and
est_change_raw_approx()
is simply a
matrix. Therefore, these functions
will work for any matrix provided.
Row number will be used on the x-axis
if applicable. However, case
identification values will be used
for labeling individual cases if they
are stored as row names.
The default settings for the plots
should be good enough for diagnostic
purpose. If so desired, users can
use the *_aes
arguments to nearly
fully customize all the major
elements of the plots, as they would
do for building a ggplot2
plot.
A ggplot2
plot. Plotted by
default. If assigned to a variable or
called inside a function, it will not
be plotted. Use plot()
to plot it.
est_change_plot()
: Index
plot of case influence on parameters.
est_change_gcd_plot()
: Plot case
influence on parameter estimates
against generalized Cook's distance.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
Pek, J., & MacCallum, R. (2011). Sensitivity analysis in structural equation models: Cases and their influence. Multivariate Behavioral Research, 46(2), 202-228. doi:10.1080/00273171.2011.561068
est_change()
,
est_change_raw()
,
est_change_approx()
, and
est_change_raw_approx()
.
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Compute approximate case influence on parameters estimates out <- est_change_approx(fit) # Plot case influence for all regression coefficients est_change_plot(out, parameters = "~", largest_change = 2) # Plot case influence against approximated gCD for all # regression coefficients # Label top 5 cases with largest approximated gCD est_change_gcd_plot(out, parameters = "~", largest_gcd = 5) # Customize elements in a plot. # For example, change the color and shape of the points. est_change_plot(out, parameters = "~", largest_change = 2, point_aes = list(shape = 5, color = "red"))
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Compute approximate case influence on parameters estimates out <- est_change_approx(fit) # Plot case influence for all regression coefficients est_change_plot(out, parameters = "~", largest_change = 2) # Plot case influence against approximated gCD for all # regression coefficients # Label top 5 cases with largest approximated gCD est_change_gcd_plot(out, parameters = "~", largest_gcd = 5) # Customize elements in a plot. # For example, change the color and shape of the points. est_change_plot(out, parameters = "~", largest_change = 2, point_aes = list(shape = 5, color = "red"))
Gets a lavaan_rerun()
output and computes the
changes in selected parameters for each case if included.
est_change_raw( rerun_out, parameters = NULL, standardized = FALSE, user_defined_label_full = FALSE )
est_change_raw( rerun_out, parameters = NULL, standardized = FALSE, user_defined_label_full = FALSE )
rerun_out |
The output from |
parameters |
A character vector to specify the selected
parameters. Each parameter is named as in |
standardized |
If |
user_defined_label_full |
Logical. If |
For each case, est_change_raw()
computes the differences
in the estimates of selected parameters with and without this
case:
(Estimate with all case) - (Estimate without this case).
The change is the raw change, either for the standardized or unstandardized solution. The change is not divided by standard error. This is a measure of the influence of a case on the parameter estimates if it is included.
If the value of a case is positive, including the case increases an estimate.
If the value of a case is negative, including the case decreases an estimate.
If the analysis is not admissible or did not converge when a case
is deleted, NA
s will be returned for this case on the
differences.
Supports both single-group and multiple-group models. (Support for multiple-group models available in 0.1.4.8 and later version).
An est_change
-class object, which is
matrix with the number of columns equals to the number of
requested parameters, and the number of rows equals to the number
of cases. The row names are the case identification values used in
lavaan_rerun()
. The elements are the raw differences.
A print method is available for user-friendly output.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
Pek, J., & MacCallum, R. (2011). Sensitivity analysis in structural equation models: Cases and their influence. Multivariate Behavioral Research, 46(2), 202-228. doi:10.1080/00273171.2011.561068
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) # Fit the model n times. Each time with one case is removed. # For illustration, do this only for four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(3, 5, 7, 8)) # Compute the changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, case influence on parameter estimates. out <- est_change_raw(fit_rerun) # Results excluding a case out # Note that these are the differences in parameter estimates. # The parameter estimates from all cases (coef_all <- coef(fit)) # The parameter estimates from manually deleting the third case fit_no_3 <- lavaan::sem(mod, dat[-3, ]) (coef_no_3 <- coef(fit_no_3)) # The differences coef_all - coef_no_3 # The first row of `est_change_raw(fit_rerun)` round(out[1, ], 3) # Compute only the changes of the paths from iv1 and iv2 to m1 out2 <- est_change_raw(fit_rerun, c("m1 ~ iv1", "m1 ~ iv2")) # Results excluding a case out2 # Note that only the changes in the selected paths are included. # Use standardized = TRUE to compare the differences in standardized solution out2_std <- est_change_raw(fit_rerun, c("m1 ~ iv1", "m1 ~ iv2"), standardized = TRUE) out2_std (est_std_all <- parameterEstimates(fit, standardized = TRUE)[1:2, c("lhs", "op", "rhs", "std.all")]) (est_std_no_1 <- parameterEstimates(fit_no_3, standardized = TRUE)[1:2, c("lhs", "op", "rhs", "std.all")]) # The differences est_std_all$std.all - est_std_no_1$std.all # The first row of `out2_std` out2_std[1, ] # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) # Examine four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 3, 5, 7)) # Compute the changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, case influence on parameter estimates. # For free loadings only out <- est_change_raw(fit_rerun, parameters = "=~") out # For standardized loadings only out_std <- est_change_raw(fit_rerun, parameters = "=~", standardized = TRUE) out_std # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::sem(mod, dat) # Examine four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 3, 5, 7)) # Compute the changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, case influence on parameter estimates. # For structural paths only out <- est_change_raw(fit_rerun, parameters = "~") out # For standardized paths only out_std <- est_change_raw(fit_rerun, parameters = "~", standardized = TRUE) out_std
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) # Fit the model n times. Each time with one case is removed. # For illustration, do this only for four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(3, 5, 7, 8)) # Compute the changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, case influence on parameter estimates. out <- est_change_raw(fit_rerun) # Results excluding a case out # Note that these are the differences in parameter estimates. # The parameter estimates from all cases (coef_all <- coef(fit)) # The parameter estimates from manually deleting the third case fit_no_3 <- lavaan::sem(mod, dat[-3, ]) (coef_no_3 <- coef(fit_no_3)) # The differences coef_all - coef_no_3 # The first row of `est_change_raw(fit_rerun)` round(out[1, ], 3) # Compute only the changes of the paths from iv1 and iv2 to m1 out2 <- est_change_raw(fit_rerun, c("m1 ~ iv1", "m1 ~ iv2")) # Results excluding a case out2 # Note that only the changes in the selected paths are included. # Use standardized = TRUE to compare the differences in standardized solution out2_std <- est_change_raw(fit_rerun, c("m1 ~ iv1", "m1 ~ iv2"), standardized = TRUE) out2_std (est_std_all <- parameterEstimates(fit, standardized = TRUE)[1:2, c("lhs", "op", "rhs", "std.all")]) (est_std_no_1 <- parameterEstimates(fit_no_3, standardized = TRUE)[1:2, c("lhs", "op", "rhs", "std.all")]) # The differences est_std_all$std.all - est_std_no_1$std.all # The first row of `out2_std` out2_std[1, ] # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) # Examine four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 3, 5, 7)) # Compute the changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, case influence on parameter estimates. # For free loadings only out <- est_change_raw(fit_rerun, parameters = "=~") out # For standardized loadings only out_std <- est_change_raw(fit_rerun, parameters = "=~", standardized = TRUE) out_std # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::sem(mod, dat) # Examine four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 3, 5, 7)) # Compute the changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, case influence on parameter estimates. # For structural paths only out <- est_change_raw(fit_rerun, parameters = "~") out # For standardized paths only out_std <- est_change_raw(fit_rerun, parameters = "~", standardized = TRUE) out_std
Gets a lavaan::lavaan()
output and computes the
approximate changes in selected parameters for each case
if included.
est_change_raw_approx( fit, parameters = NULL, case_id = NULL, allow_inadmissible = FALSE, skip_all_checks = FALSE )
est_change_raw_approx( fit, parameters = NULL, case_id = NULL, allow_inadmissible = FALSE, skip_all_checks = FALSE )
fit |
The output from |
parameters |
A character vector to specify the selected
parameters. Each parameter is named as in |
case_id |
If it is a character vector of length equals to the
number of cases (the number of rows in the data in |
allow_inadmissible |
If |
skip_all_checks |
If |
For each case, est_change_raw_approx()
computes the
approximate differences
in the estimates of selected parameters with and without this
case:
(Estimate with all case) - (Estimate without this case).
The change is the approximate raw change. The change is not divided by the standard error of an estimate (hence "raw" in the function name). This is a measure of the influence of a case on the parameter estimates if it is included.
If the value of a case is positive, including the case increases an estimate.
If the value of a case is negative, including the case decreases an estimate.
The model is not refitted. Therefore, the result is only an
approximation of that of est_change_raw()
. However, this
approximation is useful for identifying potentially influential
cases when the sample size is very large or the model takes a long
time to fit. This function can be used to identify potentially
influential cases quickly and then select them to conduct the
leave-one-out sensitivity analysis using lavaan_rerun()
and
est_change_raw()
.
Unlike est_change_raw()
, it does not yet support computing the
changes for the standardized solution.
For the technical details, please refer to the vignette
on this approach: vignette("casewise_scores", package = "semfindr")
The approximate approach supports a model with equality constraints (available in 0.1.4.8 and later version).
Supports both single-group and multiple-group models. (Support for multiple-group models available in 0.1.4.8 and later version).
An est_change
-class object, which is
matrix with the number of columns equals to the number of
requested parameters, and the number of rows equals to the number
of cases. The row names are case identification values. The
elements are the raw differences.
A print method is available for user-friendly output.
Idea by Mark Hok Chio Lai https://orcid.org/0000-0002-9196-7406, implemented by Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Compute the approximate changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, the approximate case influence on parameter estimates. out_approx <- est_change_raw_approx(fit) head(out_approx) # Fit the model several times. Each time with one case removed. # For illustration, do this only for 10 selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Compute the changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, the case influence on the parameter estimates. out <- est_change_raw(fit_rerun) out # Compare the results plot(out_approx[1:10, 1], out[, 1]) abline(a = 0, b = 1) plot(out_approx[1:10, 5], out[, 5]) abline(a = 0, b = 1) # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) summary(fit) # Compute the approximate changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, approximate case influence on parameter estimates. # Compute changes for free loadings only. out_approx <- est_change_raw_approx(fit, parameters = "=~") head(out_approx) # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Compute the approximate changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, the approximate case influence on parameter estimates. # Compute changes for structural paths only out_approx <- est_change_raw_approx(fit, parameters = c("~")) head(out_approx)
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Compute the approximate changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, the approximate case influence on parameter estimates. out_approx <- est_change_raw_approx(fit) head(out_approx) # Fit the model several times. Each time with one case removed. # For illustration, do this only for 10 selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Compute the changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, the case influence on the parameter estimates. out <- est_change_raw(fit_rerun) out # Compare the results plot(out_approx[1:10, 1], out[, 1]) abline(a = 0, b = 1) plot(out_approx[1:10, 5], out[, 5]) abline(a = 0, b = 1) # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) summary(fit) # Compute the approximate changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, approximate case influence on parameter estimates. # Compute changes for free loadings only. out_approx <- est_change_raw_approx(fit, parameters = "=~") head(out_approx) # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Compute the approximate changes in parameter estimates if a case is included # vs. if this case is excluded. # That is, the approximate case influence on parameter estimates. # Compute changes for structural paths only out_approx <- est_change_raw_approx(fit, parameters = c("~")) head(out_approx)
Gets a lavaan_rerun()
output and computes the changes
in selected fit measures if a case is included.
fit_measures_change( rerun_out, fit_measures = c("chisq", "cfi", "rmsea", "tli"), baseline_model = NULL )
fit_measures_change( rerun_out, fit_measures = c("chisq", "cfi", "rmsea", "tli"), baseline_model = NULL )
rerun_out |
The output from |
fit_measures |
The argument |
baseline_model |
The argument |
For each case, fit_measures_change()
computes the
differences in selected fit measures with and without this case:
(Fit measure with all case) - (Fit measure without this case).
If the value of a case is positive, including the case increases an estimate.
If the value of a case is negative, including the case decreases an estimate.
Note that an increase is an improvement in fit for goodness of fit measures such as CFI and TLI, but a decrease is an improvement in fit for badness of fit measures such as RMSEA and model chi-square. This is a measure of the influence of a case on a fit measure if it is included.
If the analysis is not admissible or does not converge when a case
is deleted, NA
s will be turned for the differences of this
case.
Supports both single-group and multiple-group models. (Support for multiple-group models available in 0.1.4.8 and later version).
An fit_measures_change
-class object, which is
matrix with the number of columns equals to the number of
requested fit measures, and the number of rows equals to the number
of cases. The row names are the case identification values used in
lavaan_rerun()
.
A print method is available for user-friendly output.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
Pek, J., & MacCallum, R. (2011). Sensitivity analysis in structural equation models: Cases and their influence. Multivariate Behavioral Research, 46(2), 202-228. doi:10.1080/00273171.2011.561068
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model n times. Each time with one case removed. # For illustration, do this only for four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Compute the changes in chisq if a case is included # vs. if this case is removed. # That is, case influence on model chi-squared. out <- fit_measures_change(fit_rerun, fit_measures = "chisq") # Results excluding a case, for the first few cases head(out) # Chi-square will all cases included. (chisq_all <- fitMeasures(fit, c("chisq"))) # Chi-square with the first case removed fit_01 <- lavaan::sem(mod, dat[-1, ]) (chisq_no_1 <- fitMeasures(fit_01, c("chisq"))) # Difference chisq_all - chisq_no_1 # Compare to the result from the fit_measures_change out[1, ] # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) out <- fit_measures_change(fit_rerun, fit_measures = "chisq") head(out) (chisq_all <- fitMeasures(fit, c("chisq"))) fit_01 <- lavaan::sem(mod, dat[-1, ]) (chisq_no_1 <- fitMeasures(fit_01, c("chisq"))) chisq_all - chisq_no_1 out[1, ] # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::sem(mod, dat) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) out <- fit_measures_change(fit_rerun, fit_measures = "chisq") head(out) (chisq_all <- fitMeasures(fit, c("chisq"))) fit_01 <- lavaan::sem(mod, dat[-1, ]) (chisq_no_1 <- fitMeasures(fit_01, c("chisq"))) chisq_all - chisq_no_1 out[1, ]
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model n times. Each time with one case removed. # For illustration, do this only for four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Compute the changes in chisq if a case is included # vs. if this case is removed. # That is, case influence on model chi-squared. out <- fit_measures_change(fit_rerun, fit_measures = "chisq") # Results excluding a case, for the first few cases head(out) # Chi-square will all cases included. (chisq_all <- fitMeasures(fit, c("chisq"))) # Chi-square with the first case removed fit_01 <- lavaan::sem(mod, dat[-1, ]) (chisq_no_1 <- fitMeasures(fit_01, c("chisq"))) # Difference chisq_all - chisq_no_1 # Compare to the result from the fit_measures_change out[1, ] # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) out <- fit_measures_change(fit_rerun, fit_measures = "chisq") head(out) (chisq_all <- fitMeasures(fit, c("chisq"))) fit_01 <- lavaan::sem(mod, dat[-1, ]) (chisq_no_1 <- fitMeasures(fit_01, c("chisq"))) chisq_all - chisq_no_1 out[1, ] # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::sem(mod, dat) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) out <- fit_measures_change(fit_rerun, fit_measures = "chisq") head(out) (chisq_all <- fitMeasures(fit, c("chisq"))) fit_01 <- lavaan::sem(mod, dat[-1, ]) (chisq_no_1 <- fitMeasures(fit_01, c("chisq"))) chisq_all - chisq_no_1 out[1, ]
Gets a lavaan::lavaan()
output and computes the
approximate change
in selected fit measures if a case is included.
fit_measures_change_approx( fit, fit_measures = c("chisq", "cfi", "rmsea", "tli"), baseline_model = NULL, case_id = NULL, allow_inadmissible = FALSE, skip_all_checks = FALSE )
fit_measures_change_approx( fit, fit_measures = c("chisq", "cfi", "rmsea", "tli"), baseline_model = NULL, case_id = NULL, allow_inadmissible = FALSE, skip_all_checks = FALSE )
fit |
The output from |
fit_measures |
The argument |
baseline_model |
The argument |
case_id |
If it is a character vector of length equals to the
number of cases (the number of rows in the data in |
allow_inadmissible |
If |
skip_all_checks |
If |
For each case, fit_measures_change_approx()
computes the
approximate differences in selected fit measures with and
without this case:
(Fit measure with all case) - (Fit measure without this case).
If the value of a case is positive, including the case increases an estimate.
If the value of a case is negative, including the case decreases an estimate.
Note that an increase is an improvement in fit for goodness of fit measures such as CFI and TLI, but a decrease is an improvement in fit for badness of fit measures such as RMSEA and model chi-square. This is a measure of the influence of a case on a fit measure if it is included.
The model is not refitted. Therefore, the result is only an
approximation of that of fit_measures_change()
. However, this
approximation is useful for identifying potentially influential
cases when the sample size is very large or the model takes a long
time to fit. This function can be used to identify potentially
influential cases quickly and then select them to conduct the
leave-one-out sensitivity analysis using lavaan_rerun()
and
fit_measures_change()
.
For the technical details, please refer to the vignette
on this approach: vignette("casewise_scores", package = "semfindr")
Supports both single-group and multiple-group models. (Support for multiple-group models available in 0.1.4.8 and later version).
An fit_measures_change
-class object, which is
matrix with the number of columns equals to the number of
requested fit measures, and the number of rows equals to the number
of cases. The row names are case identification values.
A print method is available for user-friendly output.
Idea by Mark Hok Chio Lai https://orcid.org/0000-0002-9196-7406, implemented by Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Approximate changes out_approx <- fit_measures_change_approx(fit, fit_measures = "chisq") head(out_approx) # Fit the model several times. Each time with one case removed. # For illustration, do this only for four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:5) # Compute the changes in chisq if a case is included # vs. if this case is excluded. # That is, case influence on model chi-squared. out <- fit_measures_change(fit_rerun, fit_measures = "chisq") # Case influence, for the first few cases head(out) # Compare the results plot(out_approx[1:5, "chisq"], out) abline(a = 0, b = 1) # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) out_approx <- fit_measures_change_approx(fit, fit_measures = "chisq") head(out_approx) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:5) # Compute the changes in chisq if a case is included # vs. if this case is excluded. # That is, case influence on fit measures. out <- fit_measures_change(fit_rerun, fit_measures = "chisq") # Results excluding a case, for the first few cases head(out) # Compare the results plot(out_approx[1:5, "chisq"], out) abline(a = 0, b = 1) # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::sem(mod, dat) out_approx <- fit_measures_change_approx(fit, fit_measures = "chisq") head(out_approx) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:5) # Compute the changes in chisq if a case is excluded # vs. if this case is included. # That is, case influence on model chi-squared. out <- fit_measures_change(fit_rerun, fit_measures = "chisq") # Case influence, for the first few cases head(out) # Compare the results plot(out_approx[1:5, "chisq"], out) abline(a = 0, b = 1)
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Approximate changes out_approx <- fit_measures_change_approx(fit, fit_measures = "chisq") head(out_approx) # Fit the model several times. Each time with one case removed. # For illustration, do this only for four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:5) # Compute the changes in chisq if a case is included # vs. if this case is excluded. # That is, case influence on model chi-squared. out <- fit_measures_change(fit_rerun, fit_measures = "chisq") # Case influence, for the first few cases head(out) # Compare the results plot(out_approx[1:5, "chisq"], out) abline(a = 0, b = 1) # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) out_approx <- fit_measures_change_approx(fit, fit_measures = "chisq") head(out_approx) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:5) # Compute the changes in chisq if a case is included # vs. if this case is excluded. # That is, case influence on fit measures. out <- fit_measures_change(fit_rerun, fit_measures = "chisq") # Results excluding a case, for the first few cases head(out) # Compare the results plot(out_approx[1:5, "chisq"], out) abline(a = 0, b = 1) # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::sem(mod, dat) out_approx <- fit_measures_change_approx(fit, fit_measures = "chisq") head(out_approx) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:5) # Compute the changes in chisq if a case is excluded # vs. if this case is included. # That is, case influence on model chi-squared. out <- fit_measures_change(fit_rerun, fit_measures = "chisq") # Case influence, for the first few cases head(out) # Compare the results plot(out_approx[1:5, "chisq"], out) abline(a = 0, b = 1)
Gets a lavaan::lavaan()
output and computes the
implied scores of observed outcome variables.
implied_scores(fit, output = "matrix", skip_all_checks = FALSE)
implied_scores(fit, output = "matrix", skip_all_checks = FALSE)
fit |
The output from |
output |
Output type. If |
skip_all_checks |
If |
The implied scores for each observed outcome variable
(the y
-variables or the endogenous variables) are
simply computed in the same way the predicted scores in a linear
regression model are computed.
Currently it supports only single-group and multiple-group path analysis models with only observed variables. (Support for multiple-group models available in 0.1.4.8 and later version).
A matrix of the implied scores if output
is "matrix"
.
If output
is "list"
, a list of matrices of the implied scores.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
library(lavaan) dat <- pa_dat # For illustration, select only the first 50 cases dat <- dat[1:50, ] # The model mod <- " m1 ~ iv1 + iv2 dv ~ m1 " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Compute the implied scores for `m1` and `dv` fit_implied_scores <- implied_scores(fit) head(fit_implied_scores)
library(lavaan) dat <- pa_dat # For illustration, select only the first 50 cases dat <- dat[1:50, ] # The model mod <- " m1 ~ iv1 + iv2 dv ~ m1 " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Compute the implied scores for `m1` and `dv` fit_implied_scores <- implied_scores(fit) head(fit_implied_scores)
A generic index plot function for plotting values of a column # in a matrix.
index_plot( object, column = NULL, plot_title = "Index Plot", x_label = NULL, cutoff_x_low = NULL, cutoff_x_high = NULL, largest_x = 1, absolute = FALSE, point_aes = list(), vline_aes = list(), hline_aes = list(), cutoff_line_aes = list(), case_label_aes = list() )
index_plot( object, column = NULL, plot_title = "Index Plot", x_label = NULL, cutoff_x_low = NULL, cutoff_x_high = NULL, largest_x = 1, absolute = FALSE, point_aes = list(), vline_aes = list(), hline_aes = list(), cutoff_line_aes = list(), case_label_aes = list() )
object |
A matrix-like object,
such as the output from
|
column |
String. The column name of the values to be plotted. |
plot_title |
The title of the
plot. Default is |
x_label |
The Label for the
vertical axis, for the value of
|
cutoff_x_low |
Cases with values
smaller than this value will be labeled.
A cutoff line will be drawn at this
value.
Default is |
cutoff_x_high |
Cases with values
larger than this value will be labeled.
A cutoff line will be drawn at this
value.
Default is |
largest_x |
The number of cases with the largest absolute value on 'column“ to be labelled. Default is 1. If not an integer, it will be rounded to the nearest integer. |
absolute |
Whether absolute values
will be plotted. Useful when cases
are to be compared on magnitude,
ignoring sign. Default is |
point_aes |
A named list of
arguments to be passed to
|
vline_aes |
A named list of
arguments to be passed to
|
hline_aes |
A named list of
arguments to be passed to
|
cutoff_line_aes |
A named list
of arguments to be passed to
|
case_label_aes |
A named list of
arguments to be passed to
|
This index plot function is for plotting any measure of influence or extremeness in a matrix. It can be used for measures not supported with other functions.
Like functions such as gcd_plot()
and est_change_plot()
, it supports
labelling cases based on the values
on the selected measure
(originaL values or absolute values).
Users can also plot cases based on the absolute values. This is useful when cases are to be compared on magnitude, ignoring the sign.
A ggplot2
plot. Plotted by
default. If assigned to a variable
or called inside a function, it will
not be plotted. Use plot()
to
plot it.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
influence_stat()
, est_change()
,
est_change_raw()
library(lavaan) dat <- pa_dat # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # --- Leave-One-Out Approach # Fit the model n times. Each time with one case removed. # For illustration, do this only for selected cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Get all default influence stats out <- influence_stat(fit_rerun) # Plot case influence on chi-square. Label the 3 cases with the influence. index_plot(out, "chisq", largest_x = 3) # Plot absolute case influence on chi-square. index_plot(out, "chisq", absolute = TRUE)
library(lavaan) dat <- pa_dat # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # --- Leave-One-Out Approach # Fit the model n times. Each time with one case removed. # For illustration, do this only for selected cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Get all default influence stats out <- influence_stat(fit_rerun) # Plot case influence on chi-square. Label the 3 cases with the influence. index_plot(out, "chisq", largest_x = 3) # Plot absolute case influence on chi-square. index_plot(out, "chisq", absolute = TRUE)
Gets an influence_stat()
output and plots selected
statistics.
gcd_plot( influence_out, cutoff_gcd = NULL, largest_gcd = 1, point_aes = list(), vline_aes = list(), cutoff_line_aes = list(), case_label_aes = list() ) md_plot( influence_out, cutoff_md = FALSE, cutoff_md_qchisq = 0.975, largest_md = 1, point_aes = list(), vline_aes = list(), cutoff_line_aes = list(), case_label_aes = list() ) gcd_gof_plot( influence_out, fit_measure, cutoff_gcd = NULL, cutoff_fit_measure = NULL, largest_gcd = 1, largest_fit_measure = 1, point_aes = list(), hline_aes = list(), cutoff_line_gcd_aes = list(), cutoff_line_fit_measures_aes = list(), case_label_aes = list() ) gcd_gof_md_plot( influence_out, fit_measure, cutoff_md = FALSE, cutoff_fit_measure = NULL, circle_size = 2, cutoff_md_qchisq = 0.975, cutoff_gcd = NULL, largest_gcd = 1, largest_md = 1, largest_fit_measure = 1, point_aes = list(), hline_aes = list(), cutoff_line_md_aes = list(), cutoff_line_gcd_aes = list(), cutoff_line_fit_measures_aes = list(), case_label_aes = list() )
gcd_plot( influence_out, cutoff_gcd = NULL, largest_gcd = 1, point_aes = list(), vline_aes = list(), cutoff_line_aes = list(), case_label_aes = list() ) md_plot( influence_out, cutoff_md = FALSE, cutoff_md_qchisq = 0.975, largest_md = 1, point_aes = list(), vline_aes = list(), cutoff_line_aes = list(), case_label_aes = list() ) gcd_gof_plot( influence_out, fit_measure, cutoff_gcd = NULL, cutoff_fit_measure = NULL, largest_gcd = 1, largest_fit_measure = 1, point_aes = list(), hline_aes = list(), cutoff_line_gcd_aes = list(), cutoff_line_fit_measures_aes = list(), case_label_aes = list() ) gcd_gof_md_plot( influence_out, fit_measure, cutoff_md = FALSE, cutoff_fit_measure = NULL, circle_size = 2, cutoff_md_qchisq = 0.975, cutoff_gcd = NULL, largest_gcd = 1, largest_md = 1, largest_fit_measure = 1, point_aes = list(), hline_aes = list(), cutoff_line_md_aes = list(), cutoff_line_gcd_aes = list(), cutoff_line_fit_measures_aes = list(), case_label_aes = list() )
influence_out |
The output from |
cutoff_gcd |
Cases with generalized Cook's distance or
approximate generalized Cook's distance larger
than this value will be labeled. Default is |
largest_gcd |
The number of cases with the largest generalized Cook's distance or approximate generalized Cook's distance to be labelled. Default is 1. If not an integer, it will be rounded to the nearest integer. |
point_aes |
A named list of
arguments to be passed to
|
vline_aes |
A named list of
arguments to be passed to
|
cutoff_line_aes |
A named list
of arguments to be passed to
|
case_label_aes |
A named list of
arguments to be passed to
|
cutoff_md |
Cases with Mahalanobis distance larger than this
value will be labeled. If it is |
cutoff_md_qchisq |
This value multiplied by 100 is the percentile to be used for labeling case based on Mahalanobis distance. Default is .975. |
largest_md |
The number of cases with the largest Mahalanobis distance to be labelled. Default is 1. If not an integer, it will be rounded to the nearest integer. |
fit_measure |
The fit measure to be used in a
plot. Use the name in the |
cutoff_fit_measure |
Cases with |
largest_fit_measure |
The number of cases with the largest selected fit measure change in magnitude to be labelled. Default is
|
hline_aes |
A named list of
arguments to be passed to
|
cutoff_line_gcd_aes |
Similar
to |
cutoff_line_fit_measures_aes |
Similar
to |
circle_size |
The size of the largest circle when the size of a circle is controlled by a statistic. |
cutoff_line_md_aes |
Similar
to |
The output of influence_stat()
is simply a matrix.
Therefore, these functions will work for any matrix provided. Row
number will be used on the x-axis if applicable. However, case
identification values in the output from influence_stat()
will
be used for labeling individual cases.
The default settings for the plots
should be good enough for diagnostic
purpose. If so desired, users can
use the *_aes
arguments to nearly
fully customize all the major
elements of the plots, as they would
do for building a ggplot2
plot.
A ggplot2
plot. Plotted by default. If assigned to a variable
or called inside a function, it will not be plotted. Use plot()
to
plot it.
gcd_plot()
: Index plot of generalized Cook's distance.
md_plot()
: Index plot of Mahalanobis distance.
gcd_gof_plot()
: Plot the case influence of the selected fit
measure against generalized Cook's distance.
gcd_gof_md_plot()
: Bubble plot of the case influence of the selected
fit measure against Mahalanobis distance, with the size of a bubble
determined by generalized Cook's distance.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
Pek, J., & MacCallum, R. (2011). Sensitivity analysis in structural equation models: Cases and their influence. Multivariate Behavioral Research, 46(2), 202-228. doi:10.1080/00273171.2011.561068
library(lavaan) dat <- pa_dat # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model n times. Each time with one case removed. # For illustration, do this only for selected cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Get all default influence stats out <- influence_stat(fit_rerun) head(out) # Plot generalized Cook's distance. Label the 3 cases with the largest distances. gcd_plot(out, largest_gcd = 3) # Plot Mahalanobis distance. Label the 3 cases with the largest distances. md_plot(out, largest_md = 3) # Plot case influence on model chi-square against generalized Cook's distance. # Label the 3 cases with the largest absolute influence. # Label the 3 cases with the largest generalized Cook's distance. gcd_gof_plot(out, fit_measure = "chisq", largest_gcd = 3, largest_fit_measure = 3) # Plot case influence on model chi-square against Mahalanobis distance. # Size of bubble determined by generalized Cook's distance. # Label the 3 cases with the largest absolute influence. # Label the 3 cases with the largest Mahalanobis distance. # Label the 3 cases with the largest generalized Cook's distance. gcd_gof_md_plot(out, fit_measure = "chisq", largest_gcd = 3, largest_fit_measure = 3, largest_md = 3, circle_size = 10) # Use the approximate method that does not require refitting the model. # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) out <- influence_stat(fit) head(out) # Plot approximate generalized Cook's distance. # Label the 3 cases with the largest values. gcd_plot(out, largest_gcd = 3) # Plot Mahalanobis distance. # Label the 3 cases with the largest values. md_plot(out, largest_md = 3) # Plot approximate case influence on model chi-square against # approximate generalized Cook's distance. # Label the 3 cases with the largest absolute approximate case influence. # Label the 3 cases with the largest approximate generalized Cook's distance. gcd_gof_plot(out, fit_measure = "chisq", largest_gcd = 3, largest_fit_measure = 3) # Plot approximate case influence on model chi-square against Mahalanobis distance. # The size of a bubble determined by approximate generalized Cook's distance. # Label the 3 cases with the largest absolute approximate case influence. # Label the 3 cases with the largest Mahalanobis distance. # Label the 3 cases with the largest approximate generalized Cook's distance. gcd_gof_md_plot(out, fit_measure = "chisq", largest_gcd = 3, largest_fit_measure = 3, largest_md = 3, circle_size = 10) # Customize elements in the plot. # For example, change the color and shape of the points. gcd_gof_plot(out, fit_measure = "chisq", largest_gcd = 3, largest_fit_measure = 3, point_aes = list(shape = 3, color = "red"))
library(lavaan) dat <- pa_dat # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model n times. Each time with one case removed. # For illustration, do this only for selected cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Get all default influence stats out <- influence_stat(fit_rerun) head(out) # Plot generalized Cook's distance. Label the 3 cases with the largest distances. gcd_plot(out, largest_gcd = 3) # Plot Mahalanobis distance. Label the 3 cases with the largest distances. md_plot(out, largest_md = 3) # Plot case influence on model chi-square against generalized Cook's distance. # Label the 3 cases with the largest absolute influence. # Label the 3 cases with the largest generalized Cook's distance. gcd_gof_plot(out, fit_measure = "chisq", largest_gcd = 3, largest_fit_measure = 3) # Plot case influence on model chi-square against Mahalanobis distance. # Size of bubble determined by generalized Cook's distance. # Label the 3 cases with the largest absolute influence. # Label the 3 cases with the largest Mahalanobis distance. # Label the 3 cases with the largest generalized Cook's distance. gcd_gof_md_plot(out, fit_measure = "chisq", largest_gcd = 3, largest_fit_measure = 3, largest_md = 3, circle_size = 10) # Use the approximate method that does not require refitting the model. # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) out <- influence_stat(fit) head(out) # Plot approximate generalized Cook's distance. # Label the 3 cases with the largest values. gcd_plot(out, largest_gcd = 3) # Plot Mahalanobis distance. # Label the 3 cases with the largest values. md_plot(out, largest_md = 3) # Plot approximate case influence on model chi-square against # approximate generalized Cook's distance. # Label the 3 cases with the largest absolute approximate case influence. # Label the 3 cases with the largest approximate generalized Cook's distance. gcd_gof_plot(out, fit_measure = "chisq", largest_gcd = 3, largest_fit_measure = 3) # Plot approximate case influence on model chi-square against Mahalanobis distance. # The size of a bubble determined by approximate generalized Cook's distance. # Label the 3 cases with the largest absolute approximate case influence. # Label the 3 cases with the largest Mahalanobis distance. # Label the 3 cases with the largest approximate generalized Cook's distance. gcd_gof_md_plot(out, fit_measure = "chisq", largest_gcd = 3, largest_fit_measure = 3, largest_md = 3, circle_size = 10) # Customize elements in the plot. # For example, change the color and shape of the points. gcd_gof_plot(out, fit_measure = "chisq", largest_gcd = 3, largest_fit_measure = 3, point_aes = list(shape = 3, color = "red"))
Gets a lavaan_rerun()
output and computes the changes
in selected parameters and fit measures for each case if included.
influence_stat( rerun_out, fit_measures = c("chisq", "cfi", "rmsea", "tli"), baseline_model = NULL, parameters = NULL, mahalanobis = TRUE, keep_fit = TRUE )
influence_stat( rerun_out, fit_measures = c("chisq", "cfi", "rmsea", "tli"), baseline_model = NULL, parameters = NULL, mahalanobis = TRUE, keep_fit = TRUE )
rerun_out |
The output from |
fit_measures |
The argument |
baseline_model |
The argument |
parameters |
A character vector to specify the selected
parameters. Each parameter is named as in |
mahalanobis |
If |
keep_fit |
If |
For each case, influence_stat()
computes the differences
in the estimates of selected parameters and fit measures with and
without this case. Users can also request a measure of extremeness (only
Mahalanobis distance is available for now).
If rerun_out
is the output of lavaan_rerun()
, it will use the
leave-one-out approach.
Measures are computed by est_change()
and fit_measures_change()
.
If rerun_out
is the output of lavaan::lavaan()
or its wrappers
(e.g., lavaan::cfa()
or lavaan::sem()
), it will use the
approximate approach.
Measures are computed by est_change_approx()
and
fit_measures_change_approx()
.
If Mahalanobis distance is requested, it is computed by
mahalanobis_rerun()
.
Please refer to the help pages of the above functions on the technical details.
Supports both single-group and multiple-group models. (Support for multiple-group models available in 0.1.4.8 and later version).
An influence_stat
-class object, which is
a matrix with the number of columns equals to the number of
requested statistics, and the number of rows equals to the number of
cases. The row names are the case identification values used in
lavaan_rerun()
. Please refer to the help pages of est_change()
and
fit_measures_change()
(or est_change_approx()
and
fit_measures_change_approx()
for details. This object
has a print method for printing user-friendly output.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
Pek, J., & MacCallum, R. (2011). Sensitivity analysis in structural equation models: Cases and their influence. Multivariate Behavioral Research, 46(2), 202-228. doi:10.1080/00273171.2011.561068
fit_measures_change()
, est_change()
, and mahalanobis_rerun()
.
library(lavaan) dat <- pa_dat # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # --- Leave-One-Out Approach # Fit the model n times. Each time with one case removed. # For illustration, do this only for selected cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Get all default influence stats out <- influence_stat(fit_rerun) head(out) # --- Approximate Approach out_approx <- influence_stat(fit) head(out_approx)
library(lavaan) dat <- pa_dat # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # --- Leave-One-Out Approach # Fit the model n times. Each time with one case removed. # For illustration, do this only for selected cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Get all default influence stats out <- influence_stat(fit_rerun) head(out) # --- Approximate Approach out_approx <- influence_stat(fit) head(out_approx)
Reruns a lavaan
analysis several
times, each time with one case removed.
lavaan_rerun( fit, case_id = NULL, to_rerun, md_top, resid_md_top, allow_inadmissible = FALSE, skip_all_checks = FALSE, parallel = FALSE, makeCluster_args = list(spec = getOption("cl.cores", 2)), rerun_method = c("lavaan", "update") )
lavaan_rerun( fit, case_id = NULL, to_rerun, md_top, resid_md_top, allow_inadmissible = FALSE, skip_all_checks = FALSE, parallel = FALSE, makeCluster_args = list(spec = getOption("cl.cores", 2)), rerun_method = c("lavaan", "update") )
fit |
The output from |
case_id |
If it is a character vector of length equals to the
number of cases (the number of rows in the data in |
to_rerun |
The cases to be processed. If |
md_top |
The number of cases to be processed based on the
Mahalanobis distance computed on all observed variables used in
the model. The cases will be ranked from the largest to the
smallest distance, and the top |
resid_md_top |
The number of cases to be processed based on
the Mahalanobis distance computed from the residuals of outcome
variables. The cases will be ranked from the largest to the
smallest distance, and the top |
allow_inadmissible |
If |
skip_all_checks |
If |
parallel |
Whether parallel will be used. If |
makeCluster_args |
A named list of arguments to be passed to
|
rerun_method |
How fit will be rerun. Default is
|
lavaan_rerun()
gets an lavaan::lavaan()
output and
reruns the analysis n0 times, using the same arguments and
options in the output, n0 equals to the number of cases selected,
by default all cases in the analysis. In each
run, one case will be removed.
Optionally, users can rerun the analysis with only selected cases
removed. These cases can be specified by case IDs, by Mahalanobis
distance computed from all variables used in the model, or by
Mahalanobis distance computed from the residuals (observed score -
implied scores) of observed outcome variables. See the help on the
arguments to_rerun
, md_top
, and resid_md_top
.
It is not recommended to use Mahalanobis distance computed from all variables, especially for models with observed variables as predictors (Pek & MacCallum, 2011). Cases that are extreme on predictors may not be influential on the parameter estimates. Nevertheless, this distance is reported in some SEM programs and so this option is provided.
Mahalanobis distance based on residuals are supported for models
with no latent factors. The implied scores are computed by
implied_scores()
.
If the sample size is large, it is recommended to use parallel
processing. However, it is possible that parallel
processing will fail. If this is the case, try to use serial
processing, by simply removing the argument parallel
or set it to
FALSE
.
Many other functions in semfindr use the output from
lavaan_rerun()
. Instead of running the n analyses every time, do
this step once and then users can compute whatever influence
statistics they want quickly.
If the analysis took a few minutes to run due to the large number
of cases or the long processing time in fitting the model, it is
recommended to save the output to an external file (e.g., by
base::saveRDS()
).
Supports both single-group and multiple-group models. (Support for multiple-group models available in 0.1.4.8 and later version).
A lavaan_rerun
-class object, which is a list with the following elements:
rerun
: The n lavaan
output objects.
fit
: The original output from lavaan
.
post_check
: A list of length equals to n. Each analysis was
checked by lavaan::lavTech(x, "post.check")
, x
being the
lavaan
results. The results of this test are stored in this
list. If the value is TRUE
, the estimation converged and the
solution is admissible. If not TRUE
, it is a warning message
issued by lavaan::lavTech()
.
converged
: A vector of length equals to n. Each analysis was
checked by lavaan::lavTech(x, "converged")
, x
being the
lavaan
results. The results of this test are stored in this
vector. If the value is TRUE
, the estimation converged. If
not TRUE
, then the estimation failed to converge if the corresponding
case is excluded.
call
: The call to lavaan_rerun()
.
selected
: A numeric vector of the row numbers of cases selected
in the analysis. Its length should be equal to the length of
rerun
.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
library(lavaan) dat <- pa_dat # For illustration, select only the first 50 cases dat <- dat[1:50, ] # The model mod <- " m1 ~ iv1 + iv2 dv ~ m1 " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model n times. Each time with one case removed. fit_rerun <- lavaan_rerun(fit, parallel = FALSE) # Print the output for a brief description of the runs fit_rerun # Results excluding the first case fitMeasures(fit_rerun$rerun[[1]], c("chisq", "cfi", "tli", "rmsea")) # Results by manually excluding the first case fit_01 <- lavaan::sem(mod, dat[-1, ]) fitMeasures(fit_01, c("chisq", "cfi", "tli", "rmsea"))
library(lavaan) dat <- pa_dat # For illustration, select only the first 50 cases dat <- dat[1:50, ] # The model mod <- " m1 ~ iv1 + iv2 dv ~ m1 " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model n times. Each time with one case removed. fit_rerun <- lavaan_rerun(fit, parallel = FALSE) # Print the output for a brief description of the runs fit_rerun # Results excluding the first case fitMeasures(fit_rerun$rerun[[1]], c("chisq", "cfi", "tli", "rmsea")) # Results by manually excluding the first case fit_01 <- lavaan::sem(mod, dat[-1, ]) fitMeasures(fit_01, c("chisq", "cfi", "tli", "rmsea"))
Gets a 'lavaan' output and checks whether it is
supported by lavaan_rerun()
.
lavaan_rerun_check(fit, print_messages = TRUE)
lavaan_rerun_check(fit, print_messages = TRUE)
fit |
The output from |
print_messages |
Logical. If |
This function is not supposed to be used by users. It is
called by lavaan_rerun()
to see if the analysis being passed to
it is supported. If not, messages will be printed to indicate why.
A single-element vector. If confirmed to be supported, will return 0. If not confirmed be support but may still work, return 1. If confirmed to be not yet supported, will return a negative number, the value of this number without the negative sign is the number of tests failed.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
dat <- cfa_dat mod <- " f1 =~ x4 + x5 + x6 " dat_gp <- dat dat$gp <- rep(c("gp1", "gp2"), length.out = nrow(dat_gp)) fit01 <- lavaan::sem(mod, dat) # If supported, returns a zero. lavaan_rerun_check(fit01) fit05 <- lavaan::cfa(mod, dat, group = "gp") # If not supported, returns a negative number. lavaan_rerun_check(fit05)
dat <- cfa_dat mod <- " f1 =~ x4 + x5 + x6 " dat_gp <- dat dat$gp <- rep(c("gp1", "gp2"), length.out = nrow(dat_gp)) fit01 <- lavaan::sem(mod, dat) # If supported, returns a zero. lavaan_rerun_check(fit01) fit05 <- lavaan::cfa(mod, dat, group = "gp") # If not supported, returns a negative number. lavaan_rerun_check(fit05)
Gets a lavaan_rerun()
or lavaan::lavaan()
output
and computes the Mahalanobis distance for each case using only the
observed predictors.
mahalanobis_predictors( fit, emNorm_arg = list(estimate.worst = FALSE, criterion = 1e-06) )
mahalanobis_predictors( fit, emNorm_arg = list(estimate.worst = FALSE, criterion = 1e-06) )
fit |
It can be the output from |
emNorm_arg |
No longer used. Kept for backward compatibility. |
For each case, mahalanobis_predictors()
computes the
Mahalanobis distance of each case on the observed predictors.
If there are no missing values, stats::mahalanobis()
will be used
to compute the Mahalanobis distance.
If there are missing values on the observed predictors, the means
and variance-covariance matrices will be estimated by maximum
likelihood using lavaan::lavCor()
. The estimates will be passed
to modi::MDmiss()
to compute the Mahalanobis distance.
Supports both single-group and multiple-group models. For multiple-group models, the Mahalanobis distance for each case is computed using the means and covariance matrix of the group this case belongs to. (Support for multiple-group models available in 0.1.4.8 and later version).
A md_semfindr
-class object, which is
a one-column matrix (a column vector) of the Mahalanobis
distance for each case. The number of rows equals to the number of
cases in the data stored in the fit object.
A print method is available for user-friendly output.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
Béguin, C., & Hulliger, B. (2004). Multivariate outlier detection in incomplete survey data: The epidemic algorithm and transformed rank correlations. Journal of the Royal Statistical Society: Series A (Statistics in Society), 167(2), 275-294.
Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proceedings of the National Institute of Science of India, 2, 49-55.
Schafer, J.L. (1997) Analysis of incomplete multivariate data. Chapman & Hall/CRC Press.
library(lavaan) dat <- pa_dat # For illustration, select only the first 50 cases. dat <- dat[1:50, ] # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) md_predictors <- mahalanobis_predictors(fit) md_predictors
library(lavaan) dat <- pa_dat # For illustration, select only the first 50 cases. dat <- dat[1:50, ] # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) md_predictors <- mahalanobis_predictors(fit) md_predictors
Computes the Mahalanobis distance for each case on all observed variables in a model.
mahalanobis_rerun( fit, emNorm_arg = list(estimate.worst = FALSE, criterion = 1e-06) )
mahalanobis_rerun( fit, emNorm_arg = list(estimate.worst = FALSE, criterion = 1e-06) )
fit |
It can be the output from |
emNorm_arg |
No longer used. Kept for backward compatibility. |
mahalanobis_rerun()
gets a lavaan_rerun()
or
lavaan::lavaan()
output and computes the Mahalanobis distance for
each case on all observed variables.
If there are no missing values, stats::mahalanobis()
will be used
to compute the Mahalanobis distance.
If there are missing values on the observed predictors, the means
and variance-covariance matrices will be estimated by maximum
likelihood using lavaan::lavCor()
. The estimates will be passed
to modi::MDmiss()
to compute the Mahalanobis distance.
Supports both single-group and multiple-group models. For multiple-group models, the Mahalanobis distance for each case is computed using the means and covariance matrix of the group this case belongs to. (Support for multiple-group models available in 0.1.4.8 and later version).
A md_semfindr
-class object, which is
a one-column matrix (a column vector) of the Mahalanobis
distance for each case. The row names are the case identification
values used in lavaan_rerun()
.
A print method is available for user-friendly output.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proceedings of the National Institute of Science of India, 2, 49-55.
library(lavaan) dat <- pa_dat # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model n times. Each time with one case removed. # For illustration, do this only for selected cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Compute the Mahalanobis distance for each case out <- mahalanobis_rerun(fit_rerun) # Results excluding a case, for the first few cases head(out) # Compute the Mahalanobis distance using stats::mahalanobis() md1 <- stats::mahalanobis(dat, colMeans(dat), stats::cov(dat)) # Compare the results head(md1) # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) mahalanobis_rerun(fit_rerun) # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::cfa(mod, dat) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) mahalanobis_rerun(fit_rerun)
library(lavaan) dat <- pa_dat # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model n times. Each time with one case removed. # For illustration, do this only for selected cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Compute the Mahalanobis distance for each case out <- mahalanobis_rerun(fit_rerun) # Results excluding a case, for the first few cases head(out) # Compute the Mahalanobis distance using stats::mahalanobis() md1 <- stats::mahalanobis(dat, colMeans(dat), stats::cov(dat)) # Compare the results head(md1) # A CFA model dat <- cfa_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f1 ~~ f2 " # Fit the model fit <- lavaan::cfa(mod, dat) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) mahalanobis_rerun(fit_rerun) # A latent variable model dat <- sem_dat mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " # Fit the model fit <- lavaan::cfa(mod, dat) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) mahalanobis_rerun(fit_rerun)
A four-variable dataset with 100 cases.
pa_dat
pa_dat
A data frame with 100 rows and 5 variables:
Mediator. Numeric.
Outcome variable. Numeric.
Predictor. Numeric.
Predictor. Numeric.
library(lavaan) data(pa_dat) mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " fit <- sem(mod, pa_dat) summary(fit)
library(lavaan) data(pa_dat) mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " fit <- sem(mod, pa_dat) summary(fit)
A four-variable dataset with 100 cases, with one influential case.
pa_dat2
pa_dat2
A data frame with 100 rows and 5 variables:
Case ID. Character.
Predictor. Numeric.
Predictor. Numeric.
Mediator. Numeric.
Outcome variable. Numeric.
library(lavaan) data(pa_dat2) mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " fit <- sem(mod, pa_dat2) summary(fit) inf_out <- influence_stat(fit) gcd_plot(inf_out)
library(lavaan) data(pa_dat2) mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " fit <- sem(mod, pa_dat2) summary(fit) inf_out <- influence_stat(fit) gcd_plot(inf_out)
Converts a vector of lavaan syntax to the ids of parameters in the vector of free parameters or the row numbers in the parameter table.
pars_id(pars, fit, where = c("coef", "partable"), free_only = TRUE)
pars_id(pars, fit, where = c("coef", "partable"), free_only = TRUE)
pars |
A character vector of parameters specified
in lavaan syntax, e.g., |
fit |
A |
where |
Where the values are to be found. Can be "partable" (parameter table) or "coef" (coefficient vector). Default is "coef". |
free_only |
Whether only free parameters will be
kept. Default is |
It supports the following ways to specify the parameters to be included.
lavaan
syntax
For example, "y ~ x"
denotes the regression coefficient
regression y
on x
. It uses lavaan::lavaanify()
to
parse the syntax strings.
Operator
For example, "~"
denotes all regression coefficients.
It also supports :=
, which can be used to select
user-defined parameters.
Label
For example, "ab"
denotes all parameters with this
labels defined in model syntax. It can be used to
select user-defined parameters, such as "ab := a*b"
.
It is used by functions such as est_change()
.
If a model has more than one group, a specification specified as in a single sample model denotes the same parameters in all group.
For example, "f1 =~ x2"
denotes the factor loading of
x2
on f1
in all groups. "~~"
denotes covariances
and error covariances in all groups.
There are two ways to select parameters only in selected
groups. First, the syntax to fix parameter values
can be used, with NA
denoting parameters to be selected.
For example, "f2 =~ c(NA, 1, NA) * x5"
selects the
factor loadings of x5
on f2
in the first and third
groups.
Users can also add ".grouplabel" to a specification,
grouplabel
being the group label of a group (the one
appears in summary()
, not the one of the form ".g2"
,
"g3"
, etc.).
For example, "f2 =~ x5.Alpha"
denotes the factor loading
of x5
on f2
in the group "Alpha"
.
This method can be used for operators. For example,
"=~.Alpha"
denotes all factors loadings in the
group "Alpha"
.
Though not recommended, users can use labels such as
".g2"
and ".g3"
to denote the parameter in a specific
group. These are the labels appear in the output of
some functions of lavaan
. Although lavaan
does not label
the parameters in the first group by ".g1"
, this can
still be used in pars_id()
.
For example, "f2 =~ x5.g2"
denotes the factor loading
of x5
on f2
in the second group. "y ~ x.g1"
denotes the regression coefficient from x
to y
in the first group.
This method can also be used for operators. For example,
"=~.g2"
denotes all factors loadings in the
second group.
However, this method is not
as reliable as using grouplabel
because the numbering
of groups depends on the order they appear in the data
set.
A numeric vector of the ids. If where
is "partable"
,
the ids are row numbers. If where
is "coef"
,
the ids are the positions in the vector.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448
dat <- sem_dat library(lavaan) sem_model <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ f1 f3 ~ f2 " fit_ng <- sem(sem_model, dat) pars <- c("f1 =~ x2", "f2 =~ x5", "f2 ~ f1") tmp <- pars_id(pars, fit = fit_ng) coef(fit_ng)[tmp] tmp <- pars_id(pars, fit = fit_ng, where = "partable") parameterTable(fit_ng)[tmp, ] # Multiple-group models dat <- sem_dat set.seed(64264) dat$gp <- sample(c("Alpha", "Beta", "Gamma"), nrow(dat), replace = TRUE) library(lavaan) sem_model <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ f1 f3 ~ f2 " fit_ng <- sem(sem_model, dat) fit_gp <- sem(sem_model, dat, group = "gp") pars <- c("f1 =~ x2", "f2 =~ x5", "f2 ~ f1") tmp <- pars_id(pars, fit = fit_ng) coef(fit_ng)[tmp] tmp <- pars_id(pars, fit = fit_ng, where = "partable") parameterTable(fit_ng)[tmp, ] pars <- c("f1 =~ x2", "f2 =~ c(NA, 1, NA) * x5") tmp <- pars_id(pars, fit = fit_gp) coef(fit_gp)[tmp] tmp <- pars_id(pars, fit = fit_gp, where = "partable") parameterTable(fit_gp)[tmp, ] pars2 <- c("f1 =~ x2", "~~.Beta", "f2 =~ x5.Gamma") tmp <- pars_id(pars2, fit = fit_gp) coef(fit_gp)[tmp] tmp <- pars_id(pars2, fit = fit_gp, where = "partable") parameterTable(fit_gp)[tmp, ] # Note that group 1 is "Beta", not "Alpha" lavInspect(fit_gp, "group.label")
dat <- sem_dat library(lavaan) sem_model <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ f1 f3 ~ f2 " fit_ng <- sem(sem_model, dat) pars <- c("f1 =~ x2", "f2 =~ x5", "f2 ~ f1") tmp <- pars_id(pars, fit = fit_ng) coef(fit_ng)[tmp] tmp <- pars_id(pars, fit = fit_ng, where = "partable") parameterTable(fit_ng)[tmp, ] # Multiple-group models dat <- sem_dat set.seed(64264) dat$gp <- sample(c("Alpha", "Beta", "Gamma"), nrow(dat), replace = TRUE) library(lavaan) sem_model <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ f1 f3 ~ f2 " fit_ng <- sem(sem_model, dat) fit_gp <- sem(sem_model, dat, group = "gp") pars <- c("f1 =~ x2", "f2 =~ x5", "f2 ~ f1") tmp <- pars_id(pars, fit = fit_ng) coef(fit_ng)[tmp] tmp <- pars_id(pars, fit = fit_ng, where = "partable") parameterTable(fit_ng)[tmp, ] pars <- c("f1 =~ x2", "f2 =~ c(NA, 1, NA) * x5") tmp <- pars_id(pars, fit = fit_gp) coef(fit_gp)[tmp] tmp <- pars_id(pars, fit = fit_gp, where = "partable") parameterTable(fit_gp)[tmp, ] pars2 <- c("f1 =~ x2", "~~.Beta", "f2 =~ x5.Gamma") tmp <- pars_id(pars2, fit = fit_gp) coef(fit_gp)[tmp] tmp <- pars_id(pars2, fit = fit_gp, where = "partable") parameterTable(fit_gp)[tmp, ] # Note that group 1 is "Beta", not "Alpha" lavInspect(fit_gp, "group.label")
Converts id numbers generated by pars_id()
to values that can be used to extract the parameters
from a source.
pars_id_to_lorg(pars_id, pars_source, type = c("free", "all"))
pars_id_to_lorg(pars_id, pars_source, type = c("free", "all"))
pars_id |
A vector of integers. Usually the output of pars_id. |
pars_source |
Can be the output of
|
type |
The meaning of the values in |
If the source is a parameter estimates table (i.e.,
the output of lavaan::parameterEstimates()
, it returns
a data frame with columns "lhs", "op", and "rhs". If
"group" is present in the source, it also add a column
"group". These columns can be used to uniquely identify
the parameters specified by the ids.
If the source is a named vector of parameters (e.g., the
output of coef()
), it returns the names of parameters
based on the ids.
If pars_source
is the output of
lavaan::parameterEstimates()
or
lavaan::parameterTable()
, it returns a subset of
pars_source
, keeping the rows of selected parameters
and the columns lhs
, op
, rhs
, and group
. If
pars_source
is a named vector of free parameters, it
returns a character vector containing the names of the
selected parameters.
dat <- sem_dat set.seed(64264) library(lavaan) sem_model <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ f1 f3 ~ f2 " fit_ng <- sem(sem_model, dat) pars <- c("f1 =~ x2", "f2 =~ x5", "f2 ~ f1") tmp <- pars_id(pars, fit = fit_ng) pars_id_to_lorg(tmp, pars_source = coef(fit_ng)) tmp <- pars_id(pars, fit = fit_ng, where = "partable") pars_id_to_lorg(tmp, pars_source = parameterEstimates(fit_ng)) # Multiple-group models dat$gp <- sample(c("Alpha", "Beta", "Gamma"), nrow(dat), replace = TRUE) fit_gp <- sem(sem_model, dat, group = "gp") pars <- c("f1 =~ x2", "f2 =~ c(NA, 1, NA) * x5") tmp <- pars_id(pars, fit = fit_gp) pars_id_to_lorg(tmp, pars_source = coef(fit_gp)) tmp <- pars_id(pars, fit = fit_gp, where = "partable") pars_id_to_lorg(tmp, pars_source = parameterEstimates(fit_gp)) parameterTable(fit_gp)[tmp, ] pars2 <- c("f1 =~ x2", "~~.Beta", "f2 =~ x5.Gamma") tmp <- pars_id(pars2, fit = fit_gp) pars_id_to_lorg(tmp, pars_source = coef(fit_gp)) tmp <- pars_id(pars2, fit = fit_gp, where = "partable") pars_id_to_lorg(tmp, pars_source = parameterEstimates(fit_gp)) # Note that group 1 is "Beta", not "Alpha" lavInspect(fit_gp, "group.label")
dat <- sem_dat set.seed(64264) library(lavaan) sem_model <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ f1 f3 ~ f2 " fit_ng <- sem(sem_model, dat) pars <- c("f1 =~ x2", "f2 =~ x5", "f2 ~ f1") tmp <- pars_id(pars, fit = fit_ng) pars_id_to_lorg(tmp, pars_source = coef(fit_ng)) tmp <- pars_id(pars, fit = fit_ng, where = "partable") pars_id_to_lorg(tmp, pars_source = parameterEstimates(fit_ng)) # Multiple-group models dat$gp <- sample(c("Alpha", "Beta", "Gamma"), nrow(dat), replace = TRUE) fit_gp <- sem(sem_model, dat, group = "gp") pars <- c("f1 =~ x2", "f2 =~ c(NA, 1, NA) * x5") tmp <- pars_id(pars, fit = fit_gp) pars_id_to_lorg(tmp, pars_source = coef(fit_gp)) tmp <- pars_id(pars, fit = fit_gp, where = "partable") pars_id_to_lorg(tmp, pars_source = parameterEstimates(fit_gp)) parameterTable(fit_gp)[tmp, ] pars2 <- c("f1 =~ x2", "~~.Beta", "f2 =~ x5.Gamma") tmp <- pars_id(pars2, fit = fit_gp) pars_id_to_lorg(tmp, pars_source = coef(fit_gp)) tmp <- pars_id(pars2, fit = fit_gp, where = "partable") pars_id_to_lorg(tmp, pars_source = parameterEstimates(fit_gp)) # Note that group 1 is "Beta", not "Alpha" lavInspect(fit_gp, "group.label")
Print the content of an 'est_change'-class object.
## S3 method for class 'est_change' print(x, digits = 3, first = 10, sort_by = c("gcd", "est"), ...)
## S3 method for class 'est_change' print(x, digits = 3, first = 10, sort_by = c("gcd", "est"), ...)
x |
An 'est_change'-class object. |
digits |
The number of digits after the decimal. Default is 3. |
first |
Numeric. If not |
sort_by |
String. Should be |
... |
Other arguments. They will be ignored. |
All the functions on case influence
on parameter estimates, est_change()
,
est_change_approx()
, est_change_raw()
,
and est_change_raw_approx()
, return
an est_change
-class object. This method will print
the output based on the type of changes and method
used.
x
is returned invisibly. Called for its side effect.
est_change_raw()
, est_change_raw_approx()
,
est_change()
, est_change_approx()
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Approximate case influence out <- est_change_approx(fit) out print(out, sort_by = "est") out <- est_change_raw_approx(fit) print(out, first = 3) # Examine four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 3, 5, 7)) est_change(fit_rerun) est_change_raw(fit_rerun)
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Approximate case influence out <- est_change_approx(fit) out print(out, sort_by = "est") out <- est_change_raw_approx(fit) print(out, first = 3) # Examine four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 3, 5, 7)) est_change(fit_rerun) est_change_raw(fit_rerun)
Print the content of a 'fit_measures_change'-class object.
## S3 method for class 'fit_measures_change' print( x, digits = 3, first = 10, sort_by = NULL, decreasing = TRUE, absolute = TRUE, ... )
## S3 method for class 'fit_measures_change' print( x, digits = 3, first = 10, sort_by = NULL, decreasing = TRUE, absolute = TRUE, ... )
x |
An 'fit_measures_change'-class object. |
digits |
The number of digits after the decimal. Default is 3. |
first |
Numeric. If not |
sort_by |
String. Default is |
decreasing |
Logical. Whether cases, if sorted,
is on decreasing order. Default is |
absolute |
Logical. Whether cases, if sorted,
are sorted on absolute values. Default is |
... |
Other arguments. They will be ignored. |
All the functions on case influence
on fit measures, fit_measures_change()
and fit_measures_change_approx()
, return
an fit_measures_change
-class object. This method will print
the output, with the option to sort the cases.
x
is returned invisibly. Called for its side effect.
fit_measures_change()
, fit_measures_change_approx()
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Case influence out <- fit_measures_change_approx(fit) out print(out, sort_by = "chisq", first = 5) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 3, 5, 7))#' out <- fit_measures_change(fit_rerun) out print(out, sort_by = "chisq", first = 5)
library(lavaan) # A path model dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Case influence out <- fit_measures_change_approx(fit) out print(out, sort_by = "chisq", first = 5) fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 3, 5, 7))#' out <- fit_measures_change(fit_rerun) out print(out, sort_by = "chisq", first = 5)
Print the content of an 'influence_stat'-class object.
## S3 method for class 'influence_stat' print( x, digits = 3, what = c("parameters", "fit_measures", "mahalanobis"), first = 10, sort_parameters_by = c("gcd", "est"), sort_fit_measures_by = NULL, sort_mahalanobis = TRUE, sort_fit_measures_decreasing = TRUE, sort_fit_measures_on_absolute = TRUE, sort_mahalanobis_decreasing = TRUE, ... )
## S3 method for class 'influence_stat' print( x, digits = 3, what = c("parameters", "fit_measures", "mahalanobis"), first = 10, sort_parameters_by = c("gcd", "est"), sort_fit_measures_by = NULL, sort_mahalanobis = TRUE, sort_fit_measures_decreasing = TRUE, sort_fit_measures_on_absolute = TRUE, sort_mahalanobis_decreasing = TRUE, ... )
x |
An 'influence_stat'-class object. |
digits |
The number of digits after the decimal. Default is 3. |
what |
A character vector of the
results #' to be printed, can be
one or more of the following:
|
first |
Numeric. If not |
sort_parameters_by |
String.
If it is |
sort_fit_measures_by |
String. Default is |
sort_mahalanobis |
Logical. If |
sort_fit_measures_decreasing |
Logical. Whether cases, if sorted
on fit measures,
are on decreasing order in the output of
case influence on fit measures. Default is |
sort_fit_measures_on_absolute |
Logical. Whether
cases, if sorted on fit measures,
are sorted on absolute values of fit measures. Default is |
sort_mahalanobis_decreasing |
Logical. Whether cases, if sorted
on Mahalanobis distance,
is on decreasing order. Default is |
... |
Optional arguments. Passed to
other print methods, such as |
This method will print
the output of influence_stat()
in a user-friendly
way. Users can select the set(s) of output,
case influence on parameter estimates,
case influence on fit measures, and
Mahalanobis distance, to be printed.
The corresponding print methods of
est_change
-class objects,
fit_measures_change
-class objects,
and md_semfindr
-class objects will be called.
x
is returned invisibly. Called for its side effect.
influence_stat()
, print.est_change()
,
print.fit_measures_change()
, print.md_semfindr()
library(lavaan) dat <- pa_dat # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # --- Leave-One-Out Approach # Fit the model n times. Each time with one case removed. # For illustration, do this only for selected cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Get all default influence stats out <- influence_stat(fit_rerun) out print(out, first = 4) print(out, what = c("parameters", "fit_measures")) # --- Approximate Approach out_approx <- influence_stat(fit) out_approx print(out, first = 8) print(out, what = c("parameters", "fit_measures"), sort_parameters_by = "est")
library(lavaan) dat <- pa_dat # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # --- Leave-One-Out Approach # Fit the model n times. Each time with one case removed. # For illustration, do this only for selected cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Get all default influence stats out <- influence_stat(fit_rerun) out print(out, first = 4) print(out, what = c("parameters", "fit_measures")) # --- Approximate Approach out_approx <- influence_stat(fit) out_approx print(out, first = 8) print(out, what = c("parameters", "fit_measures"), sort_parameters_by = "est")
Prints the results of lavaan_rerun()
.
## S3 method for class 'lavaan_rerun' print(x, ...)
## S3 method for class 'lavaan_rerun' print(x, ...)
x |
The output of |
... |
Other arguments. They will be ignored. |
x
is returned invisibly. Called for its side effect.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448
library(lavaan) dat <- pa_dat # For illustration only, select only the first 50 cases dat <- dat[1:50, ] # The model mod <- " m1 ~ iv1 + iv2 dv ~ m1 " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model n times. Each time with one case removed. fit_rerun <- lavaan_rerun(fit, parallel = FALSE) fit_rerun
library(lavaan) dat <- pa_dat # For illustration only, select only the first 50 cases dat <- dat[1:50, ] # The model mod <- " m1 ~ iv1 + iv2 dv ~ m1 " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model n times. Each time with one case removed. fit_rerun <- lavaan_rerun(fit, parallel = FALSE) fit_rerun
Print the content of a 'md_semfindr'-class object.
## S3 method for class 'md_semfindr' print(x, digits = 3, first = 10, sort = TRUE, decreasing = TRUE, ...)
## S3 method for class 'md_semfindr' print(x, digits = 3, first = 10, sort = TRUE, decreasing = TRUE, ...)
x |
An 'md_semfindr'-class object. |
digits |
The number of digits after the decimal. Default is 3. |
first |
Numeric. If not |
sort |
Logical. If |
decreasing |
Logical. Whether cases, if sorted,
is on decreasing order. Default is |
... |
Other arguments. They will be ignored. |
The print method for the
'md_semfindr'-class object, returned by
mahalanobis_rerun()
or mahalanobis_predictors()
.
This method will print
the output with the option to sort the cases.
x
is returned invisibly. Called for its side effect.
mahalanobis_rerun()
, mahalanobis_predictors()
library(lavaan) dat <- pa_dat # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model n times. Each time with one case removed. # For illustration, do this only for selected cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Compute the Mahalanobis distance for each case out <- mahalanobis_rerun(fit_rerun) out print(out, first = 3)
library(lavaan) dat <- pa_dat # The model mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 " # Fit the model fit <- lavaan::sem(mod, dat) summary(fit) # Fit the model n times. Each time with one case removed. # For illustration, do this only for selected cases. fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = 1:10) # Compute the Mahalanobis distance for each case out <- mahalanobis_rerun(fit_rerun) out print(out, first = 3)
A nine-variable dataset with 200 cases.
sem_dat
sem_dat
A data frame with 200 rows and 9 variables:
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
library(lavaan) data(sem_dat) mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " fit <- sem(mod, sem_dat) summary(fit)
library(lavaan) data(sem_dat) mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " fit <- sem(mod, sem_dat) summary(fit)
A ten-variable dataset with 200 cases, with one influential case.
sem_dat2
sem_dat2
A data frame with 200 rows and 10 variables:
Case ID. Character.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
Indicator. Numeric.
library(lavaan) data(sem_dat2) mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " fit <- sem(mod, sem_dat2) summary(fit) inf_out <- influence_stat(fit) gcd_plot(inf_out)
library(lavaan) data(sem_dat2) mod <- " f1 =~ x1 + x2 + x3 f2 =~ x4 + x5 + x6 f3 =~ x7 + x8 + x9 f2 ~ a * f1 f3 ~ b * f2 ab := a * b " fit <- sem(mod, sem_dat2) summary(fit) inf_out <- influence_stat(fit) gcd_plot(inf_out)
Gets a lavaan_rerun()
output and computes the
changes in user-defined statistics for each case if included.
user_change_raw(rerun_out, user_function = NULL, ..., fit_name = NULL)
user_change_raw(rerun_out, user_function = NULL, ..., fit_name = NULL)
rerun_out |
The output from |
user_function |
A function that accepts a
|
... |
Optional arguments to be
passed to |
fit_name |
If the function does
not accept the |
For each case, user_change_raw()
computes the differences
in user-defined statistics with and without this
case:
(User statistics with all case) - (User statistics without this case).
The change is the raw change. The change is not divided by standard error. This is a measure of the influence of a case on the use-defined statistics if it is included.
If the value of a case is positive, including the case increases a statistic.
If the value of a case is negative, including the case decreases a statistic.
The user-defined statistics are computed by a user-supplied
function, user_function
. It must return a
vector-like object (which can have only one value).
If the vector is not named, names will be created
as user_1
, user_2
, and so on.
An est_change
-class object, which is a
matrix with the number of columns equals to the number of
values returned by user_function
when computed in one
lavaan
-class object, and the number of rows equals to
the number of cases. The row names are the case
identification values used in
lavaan_rerun()
. The elements are the raw differences.
A print method is available for user-friendly output.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
Pek, J., & MacCallum, R. (2011). Sensitivity analysis in structural equation models: Cases and their influence. Multivariate Behavioral Research, 46(2), 202-228. doi:10.1080/00273171.2011.561068
# A path model library(lavaan) dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- sem(mod, dat) summary(fit) # Fit the model several times. Each time with one case removed. # For illustration, do this only for four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 4, 7, 9)) # Get the R-squares lavInspect(fit, what = "rsquare") out <- user_change_raw(fit_rerun, user_function = lavInspect, what = "rsquare") out # Index plot p <- index_plot(out, column = "dv", plot_title = "R-square: dv") p
# A path model library(lavaan) dat <- pa_dat mod <- " m1 ~ a1 * iv1 + a2 * iv2 dv ~ b * m1 a1b := a1 * b a2b := a2 * b " # Fit the model fit <- sem(mod, dat) summary(fit) # Fit the model several times. Each time with one case removed. # For illustration, do this only for four selected cases fit_rerun <- lavaan_rerun(fit, parallel = FALSE, to_rerun = c(2, 4, 7, 9)) # Get the R-squares lavInspect(fit, what = "rsquare") out <- user_change_raw(fit_rerun, user_function = lavInspect, what = "rsquare") out # Index plot p <- index_plot(out, column = "dv", plot_title = "R-square: dv") p