| Title: | Power Analysis for Moderation and Mediation |
|---|---|
| Description: | Power analysis and sample size determination for moderation, mediation, and moderated mediation in models fitted by structural equation modelling using the 'lavaan' package by Rosseel (2012) <doi:10.18637/jss.v048.i02> or by multiple regression. The package 'manymome' by Cheung and Cheung (2024) <doi:10.3758/s13428-023-02224-z> is used to specify the indirect paths or conditional indirect paths to be tested. |
| Authors: | Shu Fai Cheung [aut, cre] (ORCID: <https://orcid.org/0000-0002-9871-9448>), Sing-Hang Cheung [aut] (ORCID: <https://orcid.org/0000-0001-5182-0752>), Wendie Yang [aut] (ORCID: <https://orcid.org/0009-0000-8388-6481>) |
| Maintainer: | Shu Fai Cheung <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.2.1.3 |
| Built: | 2026-06-08 10:41:44 UTC |
| Source: | https://github.com/sfcheung/power4mome |
Helpers for the method by Boos and Zhang (2000) for estimating rejection rate for resampling-based tests.
Rs_bz_supported( alpha = 0.05, Rmax = getOption("power4mome.bz_Rmax", default = 359) ) R_for_bz(R_target, alpha = 0.05)Rs_bz_supported( alpha = 0.05, Rmax = getOption("power4mome.bz_Rmax", default = 359) ) R_for_bz(R_target, alpha = 0.05)
alpha |
The level of significance, two-tailed. |
Rmax |
The maximum number of resamples to be returned. Default is 359. Though it is possible to use 1999 resamples or even more with Boos-Zhang-2000, using such a large number of resamples defeats the goal to reduce processing time. |
R_target |
The target maximum number of resamples. |
Boos and Zhang (2000) proposed
a method to estimate the rejection
rate (power, if the null hypothesis is false)
for methods based on resampling, such
as nonparametric bootstrapping. This
method is used by some functions
in power4mome, such as x_from_power().
This method is implemented internally. Some helper functions regarding this method is exported for users.
The function Rs_bz_supported() returns
the number of bootstrap samples (for bootstrapping)
or simulated samples (for Monte Carlo),
both called resamples below for brevity,
usually specified by the argument R,
that can be
used for the Boos-Zhang-2000 method,
given the desired two-tailed level of significance,
(.05 by default).
If possible, setting the number of resamples.
(e.g., setting R when calling x_from_power())
will automatically enable the Boos-Zhang-2000
method (unless explicitly turned off
by setting the option "power4mome.bz"
to FALSE by options("power4mome.bz") <- FALSE),
substantially reducing the processing
time. For now, only two-tailed tests are supported.
Given a target maximum number of
resamples and a level of significance,
the function R_for_bz() returns
largest number of resamples that can be
used for the Boos-Zhang-2000 method.
This function can be used for
arguments such as R in x_from_power()
to automatically find the largest
value supported by the Boos-Zhang-2000
method.
The function Rs_bz_supported() returns
a numeric vector of the numbers of
resamples supported.
The function R_for_bz() returns
a scalar.
Boos, D. D., & Zhang, J. (2000). Monte Carlo evaluation of resampling-based hypothesis tests. Journal of the American Statistical Association, 95(450), 486–492. doi:10.1080/01621459.2000.10474226
# === Rs_bz_supported === # alpha = .05 Rs_bz_supported() # alpha = .01 Rs_bz_supported(alpha = .01) # === R_for_bz === R_for_bz(200) R_for_bz(500)# === Rs_bz_supported === # alpha = .05 Rs_bz_supported() # alpha = .01 Rs_bz_supported(alpha = .01) # === R_for_bz === R_for_bz(200) R_for_bz(500)
Do a test on each
replication in the output of
sim_out().
do_test( sim_all, test_fun, test_args = list(), map_names = c(fit = "fit"), results_fun = NULL, results_args = list(), parallel = FALSE, progress = FALSE, ncores = max(1, parallel::detectCores(logical = FALSE) - 1), cl = NULL )do_test( sim_all, test_fun, test_args = list(), map_names = c(fit = "fit"), results_fun = NULL, results_args = list(), parallel = FALSE, progress = FALSE, ncores = max(1, parallel::detectCores(logical = FALSE) - 1), cl = NULL )
sim_all |
The output of
|
test_fun |
A function to do the
test. See 'Details' for the requirement
of this function. There are some
built-in test functions in |
test_args |
A list of arguments
to be passed to the |
map_names |
A named character
vector specifying how the content of
the element |
results_fun |
The function to be
used to extract the test results.
See |
results_args |
A list of
arguments to be passed to the
|
parallel |
If |
progress |
If |
ncores |
The number of CPU cores to use if parallel processing is used. |
cl |
A cluster, such as one created
by |
The function do_test() does
an arbitrary test in each
replication using the function set to
test_fun.
An object of the class test_out,
which is a list of length equal to
sim_all, one element for each
replication. Each element of the list
has two elements:
test: The output of the function
set to test_fun.
test_results: The output of
the function set to results_fun.
do_test()
The function do_test() is used by
the all-in-one function
power4test(). Users usually do not
call this function directly, though
developers can use this function to
develop other functions for power
analysis, or to build their own
workflows to do the power analysis.
The function set to test_fun,
the test function, usually
should work
on the output of lavaan::sem(),
lmhelprs::many_lm(), or
stats::lm(), but can also be a
function that works on the output
of the function set to fit_function
when calling fit_model() or
power4test() (see fit_model_args).
The function has two default requirements.
First, it has an argument fit, to
be set to the output of
lavaan::sem() or another output
stored in the element extra$fit of
a replication in the sim_all
object. (This requirement can be
relaxed; see the section on map_names.)
That is, the function definition
should be of this from: FUN(fit, ...). This is the form of all
test_* functions in power4mome.
If other arguments are to be passed
to the test function, they can be set
to test_args as a named list.
Second, the test function must returns an output that (a) can be processed by the results function (see below), or (b) is of the required format for the output of a results function (see the next section). If it already returns an output of the required format, then there is no need to set the results function.
The test results will be extracted
from the output of test_fun by the
function set to results_fun,
the results function. If
the test_fun already returns an
output of the expected format
(see below), then set results_fun
to NULL, the default. The output
of test_fun will be used for
estimating power.
The function set to results_fun
must accept the output of test_fun,
as the first argument, and return a
named list (which can be a data frame)
or a named vector with some
of the following
elements:
est: Optional. The estimate of a
parameter, if applicable.
se: Optional. The standard error
of the estimate, if applicable.
cilo: Optional. The lower limit of the
confidence interval, if applicable.
cihi: Optional. The upper limit of the
confidence interval, if applicable.
sig: Required. If 1, the test is
significant. If 0, the test is not
significant. If the test cannot be
done for any reason, it should be
NA.
The results can then be used to estimate the power or Type I error of the test.
For example, if
the null hypothesis is false, then
the proportion of significant, that
is, the mean of the values of sig
across replications, is the power.
The package power4mome has some ready-to-use
test functions:
Please refer to their help pages for examples.
This argument is for developers using
a test function that has a different
name for the argument of the fit
object ("fit", the default).
If test_fun is set to a function
that works on an output of, say,
lavaan::sem() but the argument name
for the output is not fit, the
mapping can be changed by
map_names.
For example,
lavaan::parameterEstimates()
receives an output of lavaan::sem()
and reports the test results of model
parameters. However, the argument
name for the lavaan output is
object. To instruct do_test() to
do the test correctly when setting
test_fun to
lavaan::parameterEstimates, add
map_names = c(object = "fit"). The
element fit in a replication will
then set to the argument object of
lavaan::parameterEstimates().
results_fun argumentIn the early development of
power4mome, test_fun is designed
to accept existing functions from
other packages, such as
manymome::indirect_effect(). Their
outputs are not of the required
format for power analysis, and so
results functions are needed to
process their outputs. In the current
version of power4mome, some
ready-to-use test functions, usually
wrappers of these existing functions
from other packages, have been
developed, and they no longer need
results functions to process the
output. The argument results_fun is
kept for backward compatibility and
advanced users to use test functions
from other packages.
See power4test() for
the all-in-one function that uses
this function. See test_indirect_effect(),
test_cond_indirect(),
test_cond_indirect_effects(),
test_moderation(),
test_index_of_mome(), and
test_parameters() for examples of
test functions.
# Specify the population model mod <- " m ~ x y ~ m + x " # Specify the effect sizes (population parameter values) es <- " y ~ m: m m ~ x: m y ~ x: n " # Generate several simulated datasets data_all <- sim_data(nrep = 5, model = mod, pop_es = es, n = 100, iseed = 1234) # Fit the population model to each datasets fit_all <- fit_model(data_all) # Generate Monte Carlo estimates for forming Monte Carlo confidence intervals mc_all <- gen_mc(fit_all, R = 50, iseed = 4567) # Combine the results to a 'sim_all' object sim_all <- sim_out(data_all = data_all, fit = fit_all, mc_out = mc_all) # Test the indirect effect in each replication # Set `parallel` to TRUE for faster, usually much faster, analysis # Set `progress` to TRUE to display the progress of the analysis test_all <- do_test(sim_all, test_fun = test_indirect_effect, test_args = list(x = "x", m = "m", y = "y", mc_ci = TRUE), parallel = FALSE, progress = FALSE) # The result for each dataset lapply(test_all, function(x) x$test_results)# Specify the population model mod <- " m ~ x y ~ m + x " # Specify the effect sizes (population parameter values) es <- " y ~ m: m m ~ x: m y ~ x: n " # Generate several simulated datasets data_all <- sim_data(nrep = 5, model = mod, pop_es = es, n = 100, iseed = 1234) # Fit the population model to each datasets fit_all <- fit_model(data_all) # Generate Monte Carlo estimates for forming Monte Carlo confidence intervals mc_all <- gen_mc(fit_all, R = 50, iseed = 4567) # Combine the results to a 'sim_all' object sim_all <- sim_out(data_all = data_all, fit = fit_all, mc_out = mc_all) # Test the indirect effect in each replication # Set `parallel` to TRUE for faster, usually much faster, analysis # Set `progress` to TRUE to display the progress of the analysis test_all <- do_test(sim_all, test_fun = test_indirect_effect, test_args = list(x = "x", m = "m", y = "y", mc_ci = TRUE), parallel = FALSE, progress = FALSE) # The result for each dataset lapply(test_all, function(x) x$test_results)
Get the output of
sim_data() and fit a model to each
of the stored datasets.
fit_model( data_all = NULL, model = NULL, fit_function = "lavaan", arg_data_name = "data", arg_model_name = "model", arg_group_name = "group", ..., fit_out = NULL, parallel = FALSE, progress = FALSE, ncores = max(1, parallel::detectCores(logical = FALSE) - 1), cl = NULL )fit_model( data_all = NULL, model = NULL, fit_function = "lavaan", arg_data_name = "data", arg_model_name = "model", arg_group_name = "group", ..., fit_out = NULL, parallel = FALSE, progress = FALSE, ncores = max(1, parallel::detectCores(logical = FALSE) - 1), cl = NULL )
data_all |
The output
of |
model |
The model to be fitted.
If |
fit_function |
The function to
be used to fit the model. Can also
be a string: |
arg_data_name |
The name of the
argument of |
arg_model_name |
The name of
the argument of |
arg_group_name |
The name of
the argument of |
... |
Optional arguments to be
passed to |
fit_out |
If set to a |
parallel |
If |
progress |
If |
ncores |
The number of CPU cores to use if parallel processing is used. |
cl |
A cluster, such as one created
by |
By default, the function fit_model()
extracts the model
stored in the output of sim_data(),
fits the model to each dataset
simulated using fit_function,
default to "lavaan" and
lavaan::sem() will be called,
and returns the results.
If the datasets
were generated from a multigroup
model when calling sim_data(),
a multigroup model is fitted.
An object of the class fit_out,
which is a list of the output of
fit_function (lavaan::sem()
by default). If an error occurred
when fitting the model to a dataset,
then this element will be the error
message from the fit function.
fit_model()
This function is used by the
all-in-one function power4test().
Users usually do not call this
function directly, though
developers can use this function to
customize the model fitting step in
power analysis.
See power4test() for
the all-in-one function that uses
this function, and sim_data()
for the function generating datasets
for this function.
# Specify the population model mod <- "m ~ x y ~ m + x" # Specify the effect sizes (population parameter values) es <- " y ~ m: m m ~ x: m y ~ x: n " # Generate several simulated datasets data_all <- sim_data(nrep = 5, model = mod, pop_es = es, n = 100, iseed = 1234) # Fit the population model to each datasets fit_all <- fit_model(data_all) fit_all[[1]] # Fit the population model using the MLR estimator fit_all_mlr <- fit_model(data_all, estimator = "MLR") fit_all_mlr[[1]] # Fit a model different from the population model, # with the MLR estimator mod2 <- "m ~ x y ~ m" fit_all_mlr2 <- fit_model(data_all, mod2, estimator = "MLR") fit_all_mlr2[[1]]# Specify the population model mod <- "m ~ x y ~ m + x" # Specify the effect sizes (population parameter values) es <- " y ~ m: m m ~ x: m y ~ x: n " # Generate several simulated datasets data_all <- sim_data(nrep = 5, model = mod, pop_es = es, n = 100, iseed = 1234) # Fit the population model to each datasets fit_all <- fit_model(data_all) fit_all[[1]] # Fit the population model using the MLR estimator fit_all_mlr <- fit_model(data_all, estimator = "MLR") fit_all_mlr[[1]] # Fit a model different from the population model, # with the MLR estimator mod2 <- "m ~ x y ~ m" fit_all_mlr2 <- fit_model(data_all, mod2, estimator = "MLR") fit_all_mlr2[[1]]
Get a list of the output
of lavaan::sem() or lmhelprs::many_lm()
and generate
bootstrap estimates of model
parameters.
gen_boot( fit_all, R = 100, ..., iseed = NULL, parallel = FALSE, progress = FALSE, ncores = max(1, parallel::detectCores(logical = FALSE) - 1), compute_implied_stats = FALSE, cl = NULL )gen_boot( fit_all, R = 100, ..., iseed = NULL, parallel = FALSE, progress = FALSE, ncores = max(1, parallel::detectCores(logical = FALSE) - 1), compute_implied_stats = FALSE, cl = NULL )
fit_all |
The output of
|
R |
The number of replications to generate the bootstrap estimates for each fit output. |
... |
Optional arguments to be
passed to |
iseed |
The seed for the random
number generator. Default is |
parallel |
If |
progress |
If |
ncores |
The number of CPU cores to use if parallel processing is used. |
compute_implied_stats |
Whether
implied statistics are computed
in each bootstrap samples. Usually
not needed and so default to |
cl |
A cluster, such as one created
by |
The function gen_boot()
simply calls manymome::do_boot()
on each output of
lavaan::sem() or lmhelprs::many_lm()
in fit_all. The
simulated
estimates can then be used to test
effects such as indirect effects,
usually by functions from the
manymome package, such as
manymome::indirect_effect().
A boot_list object, which is a list
of the output of manymome::do_boot().
gen_boot()
This function is used by the
all-in-one function power4test().
Users usually do not call this
function directly, though
developers can use this function to
customize the workflow of the
power analysis.
See power4test() for
the all-in-one function that uses
this function.
# Specify the population model mod <- "m ~ x y ~ m + x" # Specify the effect sizes (population parameter values) es <- " y ~ m: m m ~ x: m y ~ x: n " # Generate several simulated datasets data_all <- sim_data(nrep = 2, model = mod, pop_es = es, n = 50, iseed = 1234) # Fit the population model to each datasets fit_all <- fit_model(data_all) # Generate bootstrap estimates for each replication boot_all <- gen_boot(fit_all, R = 10, iseed = 4567) boot_all# Specify the population model mod <- "m ~ x y ~ m + x" # Specify the effect sizes (population parameter values) es <- " y ~ m: m m ~ x: m y ~ x: n " # Generate several simulated datasets data_all <- sim_data(nrep = 2, model = mod, pop_es = es, n = 50, iseed = 1234) # Fit the population model to each datasets fit_all <- fit_model(data_all) # Generate bootstrap estimates for each replication boot_all <- gen_boot(fit_all, R = 10, iseed = 4567) boot_all
Get a list of the output
of lavaan::sem() and generate
Monte Carlo estimates of model
parameters.
gen_mc( fit_all, R = 100, ..., iseed = NULL, parallel = FALSE, progress = FALSE, ncores = max(1, parallel::detectCores(logical = FALSE) - 1), compute_implied_stats = FALSE, cl = NULL )gen_mc( fit_all, R = 100, ..., iseed = NULL, parallel = FALSE, progress = FALSE, ncores = max(1, parallel::detectCores(logical = FALSE) - 1), compute_implied_stats = FALSE, cl = NULL )
fit_all |
The output of
|
R |
The number of replications to generate the Monte Carlo estimates for each fit output. |
... |
Optional arguments to be
passed to |
iseed |
The seed for the random
number generator. Default is |
parallel |
If |
progress |
If |
ncores |
The number of CPU cores to use if parallel processing is used. |
compute_implied_stats |
Whether
implied statistics are computed
in each Monte Carlo replication. Usually
not needed and so default to |
cl |
A cluster, such as one created
by |
The function gen_mc() simply calls
manymome::do_mc()
on each output of
lavaan::sem() in fit_all. The
simulated
estimates can then be used to test
effects such as indirect effects,
usually by functions from the
manymome package, such as
manymome::indirect_effect().
An mc_list object, which is a list
of the output of manymome::do_mc().
gen_mc()
This function is used by the
all-in-one function power4test().
Users usually do not call this
function directly, though
developers can use this function to
customize the workflow of the
power analysis.
See power4test() for
the all-in-one function that uses
this function.
# Specify the population model mod <- "m ~ x y ~ m + x" # Specify the effect sizes (population parameter values) es <- " y ~ m: m m ~ x: m y ~ x: n " # Generate several simulated datasets data_all <- sim_data(nrep = 5, model = mod, pop_es = es, n = 100, iseed = 1234) # Fit the population model to each datasets fit_all <- fit_model(data_all) # Generate Monte Carlo estimates for each replication mc_all <- gen_mc(fit_all, R = 100, iseed = 4567) mc_all# Specify the population model mod <- "m ~ x y ~ m + x" # Specify the effect sizes (population parameter values) es <- " y ~ m: m m ~ x: m y ~ x: n " # Generate several simulated datasets data_all <- sim_data(nrep = 5, model = mod, pop_es = es, n = 100, iseed = 1234) # Fit the population model to each datasets fit_all <- fit_model(data_all) # Generate Monte Carlo estimates for each replication mc_all <- gen_mc(fit_all, R = 100, iseed = 4567) mc_all
For the process_data
argument. It introduces missing
values in the generated data.
missing_values(data, ..., prop = 0.5, mech = "MCAR")missing_values(data, ..., prop = 0.5, mech = "MCAR")
data |
A data frame. |
... |
Optional arguments to be
passed to |
prop |
The proportion of missingness. Default is 0.5, about 50% of the cases have missing data. |
mech |
The missing data mechanism.
Default is |
This function is to be used in
the process_data argument of
power4test().
The missing values are generated by
mice::ampute(). Please refer to
its help page, the vignette for
this function:
Generate missing values with ampute,
and Schouten, Lugtig, and Vink (2018).
In addition to data, only two
arguments are explicitly defined for
missing_values(): prop and mech.
The argument prop is defined to remind
users of the default value for this
argument. The argument mech, specifying
the missing data mechanism, is
defined because its default value
is different from that of mice::ampute().
For missing_values(), the default value
is "MCAR", missing completely at
random.
Please refer to the help page of
mice::ampute() for other available
arguments.
patterns and other arguments with variable namesIf arguments such as patterns are
specified, the variable names need to be
those in the generated data. If
indicator scores are generated, the
names need to be the names of the indicators,
not the names of the latent variables.
It returns a data frame with the missing values inserted.
Schouten, R. M., Lugtig, P., & Vink, G. (2018). Generating missing values for simulation purposes: A multivariate amputation procedure. Journal of Statistical Computation and Simulation, 88(15), 2909–2930. doi:10.1080/00949655.2018.1491577
power4test(). See also
mice::ampute() for the amputation
method.
# Specify the model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " y ~ m: l m ~ x: m y ~ x: n " # Simulate the data out <- power4test( nrep = 2, model = mod, pop_es = mod_es, n = 200, process_data = list(fun = "missing_values", args = list(prop = .75)), test_fun = test_parameters, test_args = list(op = "~"), parallel = FALSE, iseed = 1234) dat <- pool_sim_data(out) head(dat, 50)# Specify the model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " y ~ m: l m ~ x: m y ~ x: n " # Simulate the data out <- power4test( nrep = 2, model = mod, pop_es = mod_es, n = 200, process_data = list(fun = "missing_values", args = list(prop = .75)), test_fun = test_parameters, test_args = list(op = "~"), parallel = FALSE, iseed = 1234) dat <- pool_sim_data(out) head(dat, 50)
For the process_data
argument. It converts continuous
indicator variable to ordinal variables.
ordinal_variables(data, cut_patterns = NULL, cuts = NULL) cut_patterns(which = NULL)ordinal_variables(data, cut_patterns = NULL, cuts = NULL) cut_patterns(which = NULL)
data |
A data frame. |
cut_patterns |
A named vector.
The names are the names of the latent
variables for which indicator scores
will be converted. Each value must be
the name of one of the built-in
patterns (call |
cuts |
A named list.
The names are the names of the latent
variables for which indicator scores
will be converted.
Each element is a vector of the
thresholds for the conversion. |
which |
A name of a built-in pattern. |
This function is to be used in
the process_data argument of
power4test().
It is used for converting the continuous indicator scores generated to ordinal scores (with values 1, 2, 3, etc.).
For example, if the cut points (thresholds) are -2 and 2, then sores will be converted to three categories: (-Inf to -2], (-2 to 2], and (2 to Inf]. The intervals are closed on the right. That is, a score of -2 is in the interval (-Inf to -2], not in the interval (-2 to 2).
The conversion is implemented by cut().
There are two ways to specify the
conversion. The first one is to use
some built-in thresholds, based on
Savalei and Rhemtulla (2013), Table 1.
Call cut_patterns() with no argument
to list all the built-in patterns
and their names.
Alternatively, users can specify the
thresholds directly through the argument
cuts.
Currently, all indicators of a latent variable must be converted in the same way.
The function ordinal_variables()
returns a data frame with the
the converted scores.
The function cut_patterns()
returns a named list of the built-in
patterns if which = NULL, and
a numeric vector of the thresholds
of a built-in pattern if which is
set to one of the names of the built-in
patterns.
Savalei, V., & Rhemtulla, M. (2013). The performance of robust test statistics with categorical data. British Journal of Mathematical and Statistical Psychology, 66(2), 201–223. doi:10.1111/j.2044-8317.2012.02049.x
power4test(). See also
cut() for the implementation.
# Specify the model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " y ~ m: l m ~ x: m y ~ x: n " # Specify the numbers of indicators and reliability coefficients k <- c(y = 3, m = 4, x = 5) rel <- c(y = .70, m = .70, x = .70) # Simulate the data out <- power4test( nrep = 2, model = mod, pop_es = mod_es, n = 200, number_of_indicators = k, reliability = rel, process_data = list(fun = "ordinal_variables", args = list(cut_patterns = c(x = "s3"), cuts = list(m = c(-2, 0, 2)))), test_fun = test_parameters, test_args = list(op = "~"), parallel = FALSE, iseed = 1234) dat <- pool_sim_data(out) head(dat, 50) # The built-in patterns cut_patterns()# Specify the model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " y ~ m: l m ~ x: m y ~ x: n " # Specify the numbers of indicators and reliability coefficients k <- c(y = 3, m = 4, x = 5) rel <- c(y = .70, m = .70, x = .70) # Simulate the data out <- power4test( nrep = 2, model = mod, pop_es = mod_es, n = 200, number_of_indicators = k, reliability = rel, process_data = list(fun = "ordinal_variables", args = list(cut_patterns = c(x = "s3"), cuts = list(m = c(-2, 0, 2)))), test_fun = test_parameters, test_args = list(op = "~"), parallel = FALSE, iseed = 1234) dat <- pool_sim_data(out) head(dat, 50) # The built-in patterns cut_patterns()
Plotting the results
in a 'power_curve' object, such as the
estimated power against sample size,
or the results of power4test_by_n()
or power4test_by_es().
## S3 method for class 'power_curve' plot( x, what = c("ci", "power_curve"), main = paste0("Power Curve ", "(Predictor: ", switch(x$predictor, n = "Sample Size", es = "Effect Size"), ")"), xlab = switch(x$predictor, n = "Sample Size", es = "Effect Size"), ylab = "Estimated Power", pars_ci = list(), type = "l", ylim = c(0, 1), ci_level = 0.95, ... ) ## S3 method for class 'power4test_by_n' plot( x, what = c("ci", "power_curve"), main = "Estimated Power vs. Sample Size", xlab = "Sample Size", ylab = "Estimated Power", pars_ci = list(), type = "l", ylim = c(0, 1), ci_level = 0.95, ... ) ## S3 method for class 'power4test_by_es' plot( x, what = c("ci", "power_curve"), main = paste0("Estimated Power vs. Effect Size / Parameter (", attr(x[[1]], "pop_es_name"), ")"), xlab = paste0("Effect Size / Parameter (", attr(x[[1]], "pop_es_name"), ")"), ylab = "Estiamted Power", pars_ci = list(), type = "l", ylim = c(0, 1), ci_level = 0.95, ... )## S3 method for class 'power_curve' plot( x, what = c("ci", "power_curve"), main = paste0("Power Curve ", "(Predictor: ", switch(x$predictor, n = "Sample Size", es = "Effect Size"), ")"), xlab = switch(x$predictor, n = "Sample Size", es = "Effect Size"), ylab = "Estimated Power", pars_ci = list(), type = "l", ylim = c(0, 1), ci_level = 0.95, ... ) ## S3 method for class 'power4test_by_n' plot( x, what = c("ci", "power_curve"), main = "Estimated Power vs. Sample Size", xlab = "Sample Size", ylab = "Estimated Power", pars_ci = list(), type = "l", ylim = c(0, 1), ci_level = 0.95, ... ) ## S3 method for class 'power4test_by_es' plot( x, what = c("ci", "power_curve"), main = paste0("Estimated Power vs. Effect Size / Parameter (", attr(x[[1]], "pop_es_name"), ")"), xlab = paste0("Effect Size / Parameter (", attr(x[[1]], "pop_es_name"), ")"), ylab = "Estiamted Power", pars_ci = list(), type = "l", ylim = c(0, 1), ci_level = 0.95, ... )
x |
The object to be plotted.
It can be a |
what |
A character vector of
what to include in the
plot. Possible values are
|
main |
The title of the plot. |
xlab, ylab
|
The labels for the horizontal and vertical axes, respectively. |
pars_ci |
A named list of
arguments to be passed to |
type |
An argument of the
default plot method |
ylim |
A two-element numeric vector of the range of the vertical axis. |
ci_level |
The level of
confidence of the confidence intervals,
if requested. Default is |
... |
Optional arguments.
Passed to |
The plot method of power_curve
objects currently plots the relation
between estimated power and
the predictor. Other elements
can be requested (see the argument
what), and they can be customized
individually.
The plot-methods return x
invisibly. They
are called for their side effects.
power_curve(),
power4test_by_n(), and
power4test_by_es().
# Specify the population model model_simple_med <- " m ~ x y ~ m + x " # Specify the effect sizes (population parameter values) model_simple_med_es <- " y ~ m: l m ~ x: m y ~ x: s " # Simulate datasets to check the model sim_only <- power4test(nrep = 10, model = model_simple_med, pop_es = model_simple_med_es, n = 50, fit_model_args = list(fit_function = "lm"), do_the_test = FALSE, iseed = 1234, parallel = FALSE, progress = FALSE) # By n: Do a test for different sample sizes # Set `parallel` to TRUE for faster, usually much faster, analysis # Set `progress` to TRUE to display the progress of the analysis out1 <- power4test_by_n(sim_only, nrep = 10, test_fun = test_parameters, test_args = list(par = "y~x"), n = c(25, 50, 100), by_seed = 1234, parallel = FALSE, progress = FALSE) pout1 <- power_curve(out1) pout1 plot(pout1) # By pop_es: Do a test for different population values of a model parameter # Set `parallel` to TRUE for faster, usually much faster, analysis # Set `progress` to TRUE to display the progress of the analysis out2 <- power4test_by_es(sim_only, nrep = 10, test_fun = test_parameters, test_args = list(par = "y~x"), pop_es_name = "y ~ x", pop_es_values = c(0, .3, .5), by_seed = 1234, parallel = FALSE, progress = FALSE) pout2 <- power_curve(out2) plot(pout2)# Specify the population model model_simple_med <- " m ~ x y ~ m + x " # Specify the effect sizes (population parameter values) model_simple_med_es <- " y ~ m: l m ~ x: m y ~ x: s " # Simulate datasets to check the model sim_only <- power4test(nrep = 10, model = model_simple_med, pop_es = model_simple_med_es, n = 50, fit_model_args = list(fit_function = "lm"), do_the_test = FALSE, iseed = 1234, parallel = FALSE, progress = FALSE) # By n: Do a test for different sample sizes # Set `parallel` to TRUE for faster, usually much faster, analysis # Set `progress` to TRUE to display the progress of the analysis out1 <- power4test_by_n(sim_only, nrep = 10, test_fun = test_parameters, test_args = list(par = "y~x"), n = c(25, 50, 100), by_seed = 1234, parallel = FALSE, progress = FALSE) pout1 <- power_curve(out1) pout1 plot(pout1) # By pop_es: Do a test for different population values of a model parameter # Set `parallel` to TRUE for faster, usually much faster, analysis # Set `progress` to TRUE to display the progress of the analysis out2 <- power4test_by_es(sim_only, nrep = 10, test_fun = test_parameters, test_args = list(par = "y~x"), pop_es_name = "y ~ x", pop_es_values = c(0, .3, .5), by_seed = 1234, parallel = FALSE, progress = FALSE) pout2 <- power_curve(out2) plot(pout2)
It plots the results of 'x_from_power', such as the estimated power against sample size.
## S3 method for class 'x_from_power' plot( x, what = c("ci", "power_curve", "final_x", "final_power", "target_power", switch(x$x, n = "sig_area", es = NULL)), text_what = c("final_x", "final_power", switch(x$x, n = "sig_area", es = NULL)), digits = 3, main = NULL, xlab = NULL, ylab = "Estimated Power", ci_level = 0.95, pars_ci = list(), pars_power_curve = list(), pars_ci_final_x = list(lwd = 3, length = 0.2, col = "blue"), pars_target_power = list(lty = "dashed", lwd = 2, col = "black"), pars_final_x = list(lty = "dotted"), pars_final_power = list(lty = "dotted", col = "blue"), pars_text_final_x = list(y = 0, pos = 3, cex = 1), pars_text_final_power = list(pos = 3, cex = 1), pars_sig_area = list(col = adjustcolor("lightblue", alpha.f = 0.1)), pars_text_sig_area = list(cex = 1), prop_of_trials = NULL, min_trials_for_prop = NULL, override_for_pba = TRUE, ... ) ## S3 method for class 'n_region_from_power' plot( x, what = c("ci", "power_curve", "final_x", "final_power", "target_power", "sig_area"), text_what = c("final_x", "final_power", "sig_area"), digits = 3, main = paste0("Power Curve ", "(Target Power: ", formatC(x$below$target_power, digits = digits, format = "f"), ")"), xlab = NULL, ylab = "Estimated Power", ci_level = 0.95, pars_ci = list(), pars_power_curve = list(), pars_ci_final_x = list(lwd = 2, length = 0.2, col = "blue"), pars_target_power = list(lty = "dashed", lwd = 2, col = "black"), pars_final_x = list(lty = "dotted"), pars_final_power = list(lty = "dotted", col = "blue"), pars_text_final_x = list(pos = 3, cex = 1), pars_text_final_x_lower = pars_text_final_x, pars_text_final_x_upper = pars_text_final_x, pars_text_final_power = list(cex = 1), pars_sig_area = list(col = adjustcolor("lightblue", alpha.f = 0.1)), pars_text_sig_area = list(cex = 1), ... )## S3 method for class 'x_from_power' plot( x, what = c("ci", "power_curve", "final_x", "final_power", "target_power", switch(x$x, n = "sig_area", es = NULL)), text_what = c("final_x", "final_power", switch(x$x, n = "sig_area", es = NULL)), digits = 3, main = NULL, xlab = NULL, ylab = "Estimated Power", ci_level = 0.95, pars_ci = list(), pars_power_curve = list(), pars_ci_final_x = list(lwd = 3, length = 0.2, col = "blue"), pars_target_power = list(lty = "dashed", lwd = 2, col = "black"), pars_final_x = list(lty = "dotted"), pars_final_power = list(lty = "dotted", col = "blue"), pars_text_final_x = list(y = 0, pos = 3, cex = 1), pars_text_final_power = list(pos = 3, cex = 1), pars_sig_area = list(col = adjustcolor("lightblue", alpha.f = 0.1)), pars_text_sig_area = list(cex = 1), prop_of_trials = NULL, min_trials_for_prop = NULL, override_for_pba = TRUE, ... ) ## S3 method for class 'n_region_from_power' plot( x, what = c("ci", "power_curve", "final_x", "final_power", "target_power", "sig_area"), text_what = c("final_x", "final_power", "sig_area"), digits = 3, main = paste0("Power Curve ", "(Target Power: ", formatC(x$below$target_power, digits = digits, format = "f"), ")"), xlab = NULL, ylab = "Estimated Power", ci_level = 0.95, pars_ci = list(), pars_power_curve = list(), pars_ci_final_x = list(lwd = 2, length = 0.2, col = "blue"), pars_target_power = list(lty = "dashed", lwd = 2, col = "black"), pars_final_x = list(lty = "dotted"), pars_final_power = list(lty = "dotted", col = "blue"), pars_text_final_x = list(pos = 3, cex = 1), pars_text_final_x_lower = pars_text_final_x, pars_text_final_x_upper = pars_text_final_x, pars_text_final_power = list(cex = 1), pars_sig_area = list(col = adjustcolor("lightblue", alpha.f = 0.1)), pars_text_sig_area = list(cex = 1), ... )
x |
An |
what |
A character vector of
what to include in the
plot. Possible values are
|
text_what |
A character vector
of what numbers to be added as
labels. Possible values are
|
digits |
The number of digits after the decimal that will be used when adding numbers. |
main |
The title of the plot.
If |
xlab, ylab
|
The labels for the horizontal and vertical axes, respectively. |
ci_level |
The level of
confidence of the confidence intervals,
if requested. Default is |
pars_ci |
A named list of
arguments to be passed to |
pars_power_curve |
A named list of
arguments to be passed to |
pars_ci_final_x |
A named list of
arguments to be passed to |
pars_target_power |
A named list
of arguments to be passed to |
pars_final_x |
A
named list of arguments to be passed
to |
pars_final_power |
A
named list of arguments to be passed
to |
pars_text_final_x |
A
named list of arguments to be passed
to |
pars_text_final_power |
A
named list of arguments to be passed
to |
pars_sig_area |
A named list
of arguments to be passed to
|
pars_text_sig_area |
A named list
of arguments to be passed to
|
prop_of_trials |
The proportion
of trials to be included in the plot.
If |
min_trials_for_prop |
The minimum
number of trials for |
override_for_pba |
If |
... |
Optional arguments.
Passed to |
pars_text_final_x_lower, pars_text_final_x_upper
|
If two values of the predictor are to
be printed, these are the named list
of the arguments to be passed to |
The plot method of x_from_power
objects currently plots the relation
between estimated power and
the values examined by x_from_power().
Other elements
can be requested (see the argument
what), and they can be customized
individually.
The plot-method for
n_region_from_power objects is
a modified version of the plot-method
for x_from_power. It plots the
results of two runs of n_from_power()
in one plot. It is otherwise similar
to the plot-method for x_from_power.
The plot-method of x_from_power
returns x invisibly.
It is called for its side effect.
The plot-method of n_region_from_power
returns x invisibly.
It is called for its side effect.
# Specify the population model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " m ~ x: m y ~ m: l y ~ x: n " # Generate the datasets sim_only <- power4test(nrep = 10, model = mod, pop_es = mod_es, n = 100, do_the_test = FALSE, iseed = 1234) # Do a test test_out <- power4test(object = sim_only, test_fun = test_parameters, test_args = list(pars = "m~x")) # Determine the sample size with a power of .80 (default) power_vs_n <- x_from_power(test_out, x = "n", progress = TRUE, target_power = .80, final_nrep = 10, max_trials = 1, seed = 2345) plot(power_vs_n)# Specify the population model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " m ~ x: m y ~ m: l y ~ x: n " # Generate the datasets sim_only <- power4test(nrep = 10, model = mod, pop_es = mod_es, n = 100, do_the_test = FALSE, iseed = 1234) # Do a test test_out <- power4test(object = sim_only, test_fun = test_parameters, test_args = list(pars = "m~x")) # Determine the sample size with a power of .80 (default) power_vs_n <- x_from_power(test_out, x = "n", progress = TRUE, target_power = .80, final_nrep = 10, max_trials = 1, seed = 2345) plot(power_vs_n)
Convert a YAML string
to a vector or list of pop_es
specification.
pop_es_yaml(text)pop_es_yaml(text)
text |
The multiline string to be parsed to a specification of population values. |
The function pop_es_yaml()
allows users to specify population
values of a model using one single
string, as in 'lavaan' model syntax.
Either a named vector (for a single-group model) or a list of named vector (for a multigroup model) of the population values of the parameters (the effect sizes).
When setting the argument pop_es,
instead of using a named vector
or named list for
pop_es, the population values of
model parameters can also be
specified using a multiline string,
as illustrated below, to be parsed
by pop_es_yaml().
This is an example of the multiline string for a single-group model:
y ~ m: l m ~ x: m y ~ x: nil
The string must follow this format:
Each line starts with tag:.
tag can be the name of a
parameter, in lavaan model
syntax format.
For example, m ~ x
denotes the path from x to m.
A tag in lavaan model syntax can
specify more than one parameter
using +.
For example, y ~ m + x
denotes the two paths from m and
x to y.
Alternatively, the tag can be
either .beta. or .cov..
Use .beta. to set the default
values for all regression coefficients.
Use .cov. to set the default
values for all correlations of
exogenous variables (e.g., predictors).
After each tag is the value of the population value:
-nil for nil (zero),
s for small,
m for medium, and
l for large.
si, mi, and li for
small, medium, and large a
standardized indirect effect,
respectively.
Note: n cannot be used in this mode.
The
value for each label is determined
by es1 and es2 as described
in ptable_pop().
The value can also be
set to a numeric value, such as
.30 or -.30.
This is another example:
.beta.: s y ~ m: l
In this example, all regression
coefficients are small, while
the path from m to y is large.
This is an example of the string for a multigroup model:
y ~ m: l m ~ x: - nil - s y ~ x: nil
The format is similar to that for
a single-group model. If a parameter
has the same value for all groups,
then the line can be specified
as in the case of a single-group
model: tag: value.
If a parameter has different values across groups, then it must be in this format:
A line starts with the tag, followed
by two or more lines. Each line
starts with a hyphen - and the
value for a group.
For example:
m ~ x: - nil - s
This denotes that the model has
two groups. The values of the path
from x to m for the two
groups are 0 (nil) and
small (s), respectively.
Another equivalent way to specify
the values are using [], on
the same line of a tag.
For example:
m ~ x: [nil, s]
The number of groups is inferred from the number of values for a parameter. Therefore, if a tag has more than one value, each tag must has the same number of value, or only one value.
The tag .beta. and .cov. can
also be used for multigroup models.
Note that using named vectors or named lists is more reliable. However, using a multiline string is more user-friendly. If this method failed, please use named vectors or named list instead.
The multiline string is parsed by yaml::read_yaml().
Therefore, the format requirement
is actually that of YAML. Users
knowledgeable of YAML can use other
equivalent way to specify the string.
ptable_pop(),
power4test(), and other functions
that have the pop_es argument.
mod_es <- c("y ~ m" = "l", "m ~ x" = "m", "y ~ x" = "nil") mod_es_yaml <- " y ~ m: l m ~ x: m y ~ x: nil " pop_es_yaml(mod_es_yaml)mod_es <- c("y ~ m" = "l", "m ~ x" = "m", "y ~ x" = "nil") mod_es_yaml <- " y ~ m: l m ~ x: m y ~ x: nil " pop_es_yaml(mod_es_yaml)
Estimate the relation between power and a characteristic, such as sample size or population effect size (population value of a model parameter).
power_curve( object, formula = NULL, start = NULL, lower_bound = NULL, upper_bound = NULL, nls_args = list(), nls_control = list(), verbose = FALSE, models = c("nls", "logistic", "lm"), nls_options = list(min_points = 4, regions = c(0, 0.45, 0.75, 0.85, 1)) ) ## S3 method for class 'power_curve' print(x, data_used = FALSE, digits = 3, right = FALSE, row.names = FALSE, ...)power_curve( object, formula = NULL, start = NULL, lower_bound = NULL, upper_bound = NULL, nls_args = list(), nls_control = list(), verbose = FALSE, models = c("nls", "logistic", "lm"), nls_options = list(min_points = 4, regions = c(0, 0.45, 0.75, 0.85, 1)) ) ## S3 method for class 'power_curve' print(x, data_used = FALSE, digits = 3, right = FALSE, row.names = FALSE, ...)
object |
An object of the class
|
formula |
A formula of the model
for |
start |
Either a named vector
of the start value(s) of parameter(s)
in |
lower_bound |
Either a named vector
of the lower bound(s) of parameter(s)
in |
upper_bound |
Either a named vector
of the upper bound(s) of parameter(s)
in |
nls_args |
A named list of
arguments to be used when calling
|
nls_control |
A named list of
arguments to be passed the |
verbose |
Logical. Whether messages will be printed when trying different models. |
models |
Models to try. Support
|
nls_options |
A named list of
options to be used by |
x |
A |
data_used |
Logical. Whether the rejection rates data frame used to fit the model is printed. |
digits, right, row.names
|
Arguments of the same names used
by the |
... |
For the |
The function power_curve()
retrieves the information
from the output of
power4test_by_n() or
power4test_by_es(), and
estimate the power curve: the
relation between the characteristic
varied, sample size for
power4test_by_n() and the
population effect size for
power4test_by_es(), and the
rejection rate of the test conducted
by power4test_by_n() or
power4test_by_es(). This
rejection rate is the power when the
null hypothesis is false (e.g., the
population value of the effect size
being tested is nonzero).
The model fitted is not intended to
be a precise model for the relation
across a wide range. It is only a
crude estimate based on the limited
number of values of the
characteristic (e.g., sample size)
examined, which can be as small as
four or even smaller. The model is
intended to be
used for only for the range covered,
and for estimating the probable
sample size or effect size with a
desirable level of power. This value
should then be studied by higher
precision through simulation
using functions such as
power4test().
These are the models to be tried, in the following order:
One or nonlinear models, to be
fitted by stats::nls(). If
several models are specified, all
will be fitted and the one with the
smallest deviance will be used.
If all the nonlinear models failed,
for whatever reason, a logistic
regression will be fitted by
stats::glm() to predict the
binary significant test results.
If the logistic model also failed, for whatever reason, a simple linear regression model will be fitted. Although the power curve is nonlinear across a wide range of, say, sample size, a linear model can still be a good enough approximation for a narrow range of the predictor.
The output can then be plotted to
visualize the power curve, using
the plot method (plot.power_curve())
for the output
of power_curve().
This function can be used directly,
but is also used internally by
functions such as x_from_power().
These are possible arguments.
min_points: If the object has
less than min_points rejection
rates, nls will not be used.
regions: A numeric vector with
values from 0 to 1. It defines
regions in which there must be
at least one rejection rate. If this
criterion is not met, nls will
not be used. This is to ensure that
nls will not be used when the
rejection rates are clustered around
a very narrow region.
It returns a list which is a
power_curve object, with the
following elements:
fit: The model fitted, which is the
output of stats::nls(),
stats::glm(), or stats::lm().
reject_df: The table of reject
rates and other characteristics,
which is generated by
rejection_rates().
predictor: The predictor or the
power curve, ether "n" (sample
size) or "es" (population effect
size).
call: The call used to run this
function.
The print method of power_curve
object returns x invisibly. Called
for its side-effect.
power4test_by_n() and power4test_by_es()
for the output supported by
power_curve(), plot.power_curve()
for the plot method and
predict.power_curve()
for the predict method of the output
of power_curve().
# Specify the population model model_simple_med <- " m ~ x y ~ m + x " # Specify the effect sizes (population parameter values) model_simple_med_es <- " y ~ m: l m ~ x: m y ~ x: s " # Simulate datasets to check the model # Set `parallel` to TRUE for faster, usually much faster, analysis # Set `progress` to TRUE to display the progress of the analysis sim_only <- power4test(nrep = 10, model = model_simple_med, pop_es = model_simple_med_es, n = 50, fit_model_args = list(fit_function = "lm"), do_the_test = FALSE, iseed = 1234, parallel = FALSE, progress = FALSE) # By n: Do a test for different sample sizes out1 <- power4test_by_n(sim_only, nrep = 10, test_fun = test_parameters, test_args = list(par = "y~x"), n = c(25, 50, 100), by_seed = 1234, parallel = FALSE, progress = FALSE) pout1 <- power_curve(out1) pout1 plot(pout1) # By pop_es: Do a test for different population values of a model parameter out2 <- power4test_by_es(sim_only, nrep = 10, test_fun = test_parameters, test_args = list(par = "y~x"), pop_es_name = "y ~ x", pop_es_values = c(0, .3, .5), by_seed = 1234, parallel = FALSE, progress = FALSE) pout2 <- power_curve(out2) pout2 plot(pout2)# Specify the population model model_simple_med <- " m ~ x y ~ m + x " # Specify the effect sizes (population parameter values) model_simple_med_es <- " y ~ m: l m ~ x: m y ~ x: s " # Simulate datasets to check the model # Set `parallel` to TRUE for faster, usually much faster, analysis # Set `progress` to TRUE to display the progress of the analysis sim_only <- power4test(nrep = 10, model = model_simple_med, pop_es = model_simple_med_es, n = 50, fit_model_args = list(fit_function = "lm"), do_the_test = FALSE, iseed = 1234, parallel = FALSE, progress = FALSE) # By n: Do a test for different sample sizes out1 <- power4test_by_n(sim_only, nrep = 10, test_fun = test_parameters, test_args = list(par = "y~x"), n = c(25, 50, 100), by_seed = 1234, parallel = FALSE, progress = FALSE) pout1 <- power_curve(out1) pout1 plot(pout1) # By pop_es: Do a test for different population values of a model parameter out2 <- power4test_by_es(sim_only, nrep = 10, test_fun = test_parameters, test_args = list(par = "y~x"), pop_es_name = "y ~ x", pop_es_values = c(0, .3, .5), by_seed = 1234, parallel = FALSE, progress = FALSE) pout2 <- power_curve(out2) pout2 plot(pout2)
An all-in-one function that receives a model specification, generates datasets, fits a model, does the target test, and returns the test results.
power4test( object = NULL, nrep = NULL, ptable = NULL, model = NULL, pop_es = NULL, standardized = TRUE, n = NULL, number_of_indicators = NULL, reliability = NULL, loading_difference = NULL, reference = NULL, x_fun = list(), e_fun = list(), process_data = NULL, fit_model_args = list(), R = NULL, ci_type = "mc", gen_mc_args = list(), gen_boot_args = list(), test_fun = NULL, test_args = list(), map_names = c(fit = "fit"), results_fun = NULL, results_args = list(), test_name = NULL, test_note = NULL, do_the_test = TRUE, sim_all = NULL, iseed = NULL, parallel = FALSE, progress = TRUE, ncores = max(1, parallel::detectCores(logical = FALSE) - 1), es1 = c(n = 0, nil = 0, s = 0.1, m = 0.3, l = 0.5, si = 0.141, mi = 0.361, li = 0.51, sm = 0.2, ml = 0.4), es2 = c(n = 0, nil = 0, s = 0.05, m = 0.1, l = 0.15, sm = 0.075, ml = 0.125), es_ind = c("si", "mi", "li"), n_std = 1e+05, std_force_monte_carlo = FALSE, rejection_rates_args = list(collapse = "none", at_least_k = 1, merge_all_tests = FALSE, p_adjust_method = "none", alpha = 0.05), n_ratio = 1 ) ## S3 method for class 'power4test' print( x, what = c("data", "test"), digits = 3, digits_descriptive = 2, data_long = FALSE, test_long = TRUE, fit_to_all_args = list(), ... )power4test( object = NULL, nrep = NULL, ptable = NULL, model = NULL, pop_es = NULL, standardized = TRUE, n = NULL, number_of_indicators = NULL, reliability = NULL, loading_difference = NULL, reference = NULL, x_fun = list(), e_fun = list(), process_data = NULL, fit_model_args = list(), R = NULL, ci_type = "mc", gen_mc_args = list(), gen_boot_args = list(), test_fun = NULL, test_args = list(), map_names = c(fit = "fit"), results_fun = NULL, results_args = list(), test_name = NULL, test_note = NULL, do_the_test = TRUE, sim_all = NULL, iseed = NULL, parallel = FALSE, progress = TRUE, ncores = max(1, parallel::detectCores(logical = FALSE) - 1), es1 = c(n = 0, nil = 0, s = 0.1, m = 0.3, l = 0.5, si = 0.141, mi = 0.361, li = 0.51, sm = 0.2, ml = 0.4), es2 = c(n = 0, nil = 0, s = 0.05, m = 0.1, l = 0.15, sm = 0.075, ml = 0.125), es_ind = c("si", "mi", "li"), n_std = 1e+05, std_force_monte_carlo = FALSE, rejection_rates_args = list(collapse = "none", at_least_k = 1, merge_all_tests = FALSE, p_adjust_method = "none", alpha = 0.05), n_ratio = 1 ) ## S3 method for class 'power4test' print( x, what = c("data", "test"), digits = 3, digits_descriptive = 2, data_long = FALSE, test_long = TRUE, fit_to_all_args = list(), ... )
object |
Optional. If set to a
|
nrep |
The number of replications
to generate the simulated datasets.
Default is |
ptable |
The output of
|
model |
The |
pop_es |
The character vector or
multiline string to
specify population effect sizes
(population values of parameters). See
the help page on how to specify this
argument.
Ignored if |
standardized |
Logical. If
|
n |
The sample size for each dataset. Default is 100. |
number_of_indicators |
A named
vector to specify the number of
indicators for each factors. See
the help page on how to set this
argument. Default is |
reliability |
A named vector
(for a single-group model) or a
named list of named vectors
(for a multigroup model)
to set the reliability coefficient
of each set of indicators. Default
is |
loading_difference |
A named vector
(for a single-group model) or a
named list of named vectors
(for a multigroup model)
to set the difference in factor
loadings between neighboring indicators
of each set of indicators. Default
is |
reference |
A named vector
(for a single-group model) or a
named list of named vectors
(for a multigroup model)
to indicate which indicator will
be the first indicator (and so
is the reference indicator, by default).
Default
is |
x_fun |
The function(s) used to
generate the exogenous variables or
error terms. If
not supplied, or set to |
e_fun |
The function(s) used to
generate the error terms of indicators,
if any. If
not supplied, or set to |
process_data |
If not |
fit_model_args |
A list of the
arguments to be passed to |
R |
The number of replications
to generate the Monte Carlo or
bootstrapping estimates
for each fit output. No Monte Carlo
nor bootstrapping
estimates will be generated if |
ci_type |
The type of
simulation-based confidence
intervals to use. Can be either
|
gen_mc_args |
A list of
arguments to be passed to
|
gen_boot_args |
A list of
arguments to be passed to
|
test_fun |
A function to do the
test. See 'Details' for the requirement
of this function. There are some
built-in test functions in |
test_args |
A list of arguments
to be passed to the |
map_names |
A named character
vector specifying how the content of
the element |
results_fun |
The function to be
used to extract the test results.
See |
results_args |
A list of
arguments to be passed to the
|
test_name |
String. The name
of the test. Default is |
test_note |
String. An optional
note for the test, stored in the
attribute |
do_the_test |
If |
sim_all |
If set to either a
|
iseed |
The seed for the random
number generator. Default is |
parallel |
If |
progress |
If |
ncores |
The number of CPU cores to use if parallel processing is used. |
es1 |
Set the
values for each label of the effect
size (population value) for correlations and regression
paths.
Used only if |
es2 |
Set the
values for each label of the effect
size (population value) for product term.
Used only if |
es_ind |
The names of labels denoting the effect size of an indirect effect. They will be used to determine the population values of the component paths along an indirect path. |
n_std |
The sample size used to
determine the error variances by
simulation when |
std_force_monte_carlo |
Logical.
If |
rejection_rates_args |
Default
argument values to be used when
|
n_ratio |
If the model is a
multigroup model, and |
x |
The object to be printed. |
what |
A string vector of
what to print, |
digits |
The numbers of digits displayed after the decimal. |
digits_descriptive |
The number of digits displayed after the decimal for the descriptive statistics table. |
data_long |
If |
test_long |
If |
fit_to_all_args |
A named list
of arguments to be passed to
|
... |
Optional arguments to be passed to other print methods |
The function power4test() is an
all-in-one function for
estimating the power of a test for
a model, given the sample size
and effect sizes (population values
of model parameters).
An object of the class power4test,
which is a list with two elements:
sim_all: The output of sim_out().
test_all: A named list of the
output of do_test(). The names
are the values of test_name.
This list can have more than one
test because a call to
power4test() can add new tests
to a power4test object.
The print method of power4test
returns x invisibly. Called for
its side effect.
This is the workflow:
If object is an output
of the output
of a previous call to
power4test() with do_the_test
set to FALSE and so has only the
model and the simulated data,
the following steps
will be skipped and go directly to
doing the test.
Call sim_data() to determine
the population model and
generate the datasets, using
arguments such as model and
pop_es.
Call fit_model() to fit a
model to each of the datasets,
which is the population model
by default.
If R is not NULL and
ci_type = "mc", call
gen_mc() to generate Monte
Carlo estimates using
manymome::do_mc(). The estimates
can be used by supported functions
such as test_indirect_effect().
If R is not NULL and
ci_type = "boot", call
gen_boot() to generate
bootstrap estimates using
manymome::do_boot(). The estimates
can be used by supported functions
such as test_indirect_effect().
Merge the results into a
sim_out object by calling
sim_out().
If do_the_test is FALSE,
skip the remaining steps and
return a power4test object,
which contains only the data
generated and optionally the
Monte Carlo or bootstrap
estimates.
If do_the_test is TRUE, do
the test.
do_test() will be called to do
the test in the fit output of
each dataset.
Return a power4test object which
include the output of sim_out
and, if do_the_test is TRUE,
the output of do_test().
This function is to be used when users are interested only in the power of one or several tests on a particular aspect of the model, such as a parameter, given a specific effect sizes and sample sizes.
Detailed description on major arguments can be found in sections below.
NOTE: The technical internal workflow of
of power4test() can be found in
this page: https://sfcheung.github.io/power4mome/articles/power4test_workflow.html.
The function power4test() can also be used to
update a condition when only some
selected aspects is to be changed.
For example,
instead of calling this function with
all the arguments set just to change
the sample size, it can be called
by supplying an existing
power4test object and set only
n to a new sample size. The data
and the tests will be updated
automatically. See the examples for
an illustration.
The function power4test() can also be used to
add a test to the output from a
previous call to power4test().
For example, after simulating the
datasets and doing one test,
the output can be set to object
of power4test(), and set only
test_fun and, optionally,
test_fun_args to do one more test
on the generated datasets. The output
will be the original
object with the results of the new
test added. See the examples for
an illustration.
For power analysis, usually, the
population model (model) is to be
fitted, and there is no need to
set fit_model_args.
If power analysis is to be conducted
for fitting a model that is not the
population model, of if non-default
settings are desired when fitting
a model, then the argument fit_model_args
needed to be set to customize the
call to fit_model().
For example,
users may want to examine the power
of a test when a misspecified model
is fitted, or the power of a test
when MLR is used as the estimator
when calling lavaan::sem().
See the help page of fit_model()
for some examples.
For a single-group model, model
should be a lavaan model syntax
string of the form of the model.
The population values of the model
parameters are to be determined by
pop_es.
If the model has latent factors,
the syntax in model should specify
only the structural model for the
latent factors. There is no
need to specify the measurement
part. Other functions will generate
the measurement part on top of this
model.
For example, this is a simple mediation model:
"m ~ x y ~ m + x"
Whether m, x, and y denote
observed variables or latent factors
are determined by other functions,
such as power4test().
Because the model is the population
model, equality constraints are
irrelevant and the model syntax
specifies only the form of the
model. Therefore, model is
specified as in the case of single
group models.
The argument pop_es is for specifying
the population values of model
parameters. This section describes
how to do this using named vectors.
If pop_es is specified by a named
vector, it must follow the convention
below.
The names of the vectors are
lavaan names for the selected
parameters. For example, m ~ x
denotes the path from x to m.
Alternatively, the names can be
either ".beta." or ".cov.".
Use ".beta." to set the default
values for all regression coefficients.
Use ".cov." to set the default
values for all correlations of
exogenous variables (e.g., predictors).
The names can also be of this form:
".ind.(<path>)", whether <path>
denote path in the model. For
example, ".ind.(x->m->y)" denotes
the path from x through m to
y. Alternatively, the lavaan
symbol ~ can also be used:
".ind.(y~m~x)". This form is used
to set the indirect effect (standardized,
by default) along this path. The
value for this name will override
other settings.
If using lavaan names, can
specify more than one parameter
using +. For example, y ~ m + x
denotes the two paths from m and
x to y.
The value of each element can be
the label for the effect size: n
for nil, s for small, m for
medium, and l for large. The
value for each label is determined
by es1 and es2. See the section
on specifying these two arguments.
The value of pop_es can also be
set to a value, but it must be
quoted as a string, such as "y ~ x" = ".31".
This is an example:
c(".beta." = "s",
"m1 ~ x" = "-m",
"m2 ~ m1" = "l",
"y ~ x:w" = "s")
In this example,
All regression coefficients are
set to small (s) by default, unless
specified otherwise.
The path from x to m1 is
set to medium and negative (-m).
The path from m1 to m2 is set
to large (l).
The coefficient of the product
term x:w when predicting y is
set to small (s).
When setting an indirect effect to
a symbol (default: "si", "mi",
"li", with "i" added to differentiate
them from the labels for a direct path),
the corresponding value is used to
determine the population values of
all component paths along the pathway.
All the values are assumed to be equal.
Therefore, ".ind.(x->m->y)" = ".20"
is equivalent to setting m ~ x
and y ~ m to the square root of
.20, such that the corresponding
indirect effect is equal to the
designated value.
This behavior, though restricted, is for quick manipulation of the indirect effect. If different values along a pathway, set the value for each path directly.
Only nonnegative value is supported.
Therefore, ".ind.(x->m->y)" = "-si"
and ".ind.(x->m->y)" = "-.20" will
throw an error.
The argument pop_es also supports multigroup
models.
For pop_es, instead of
named vectors, named list of
named vectors should be used.
The names are the parameters, or
keywords such as .beta. and
.cov., like specifying the
population values for a single
group model.
The elements are character vectors. If it has only one element (e.g., a single string), then it is the the population value for all groups. If it has more than one element (e.g., a vector of three strings), then they are the population values of the groups. For a model of k groups, each vector must have either k elements or one element.
This is an example:
list("m ~ x" = "m",
"y ~ m" = c("s", "m", "l"))
In this model, the population value
of the path m ~ x is medium (m) for
all groups, while the population
values for the path y ~ m are
small (s), medium (m), and large (l),
respectively.
When setting the argument pop_es,
instead of using a named vector
or named list for
pop_es, the population values of
model parameters can also be
specified using a multiline string,
as illustrated below, to be parsed
by pop_es_yaml().
This is an example of the multiline string for a single-group model:
y ~ m: l m ~ x: m y ~ x: nil
The string must follow this format:
Each line starts with tag:.
tag can be the name of a
parameter, in lavaan model
syntax format.
For example, m ~ x
denotes the path from x to m.
A tag in lavaan model syntax can
specify more than one parameter
using +.
For example, y ~ m + x
denotes the two paths from m and
x to y.
Alternatively, the tag can be
either .beta. or .cov..
Use .beta. to set the default
values for all regression coefficients.
Use .cov. to set the default
values for all correlations of
exogenous variables (e.g., predictors).
After each tag is the value of the population value:
-nil for nil (zero),
s for small,
m for medium, and
l for large.
si, mi, and li for
small, medium, and large a
standardized indirect effect,
respectively.
Note: n cannot be used in this mode.
The
value for each label is determined
by es1 and es2 as described
in ptable_pop().
The value can also be
set to a numeric value, such as
.30 or -.30.
This is another example:
.beta.: s y ~ m: l
In this example, all regression
coefficients are small, while
the path from m to y is large.
This is an example of the string for a multigroup model:
y ~ m: l m ~ x: - nil - s y ~ x: nil
The format is similar to that for
a single-group model. If a parameter
has the same value for all groups,
then the line can be specified
as in the case of a single-group
model: tag: value.
If a parameter has different values across groups, then it must be in this format:
A line starts with the tag, followed
by two or more lines. Each line
starts with a hyphen - and the
value for a group.
For example:
m ~ x: - nil - s
This denotes that the model has
two groups. The values of the path
from x to m for the two
groups are 0 (nil) and
small (s), respectively.
Another equivalent way to specify
the values are using [], on
the same line of a tag.
For example:
m ~ x: [nil, s]
The number of groups is inferred from the number of values for a parameter. Therefore, if a tag has more than one value, each tag must has the same number of value, or only one value.
The tag .beta. and .cov. can
also be used for multigroup models.
Note that using named vectors or named lists is more reliable. However, using a multiline string is more user-friendly. If this method failed, please use named vectors or named list instead.
The multiline string is parsed by yaml::read_yaml().
Therefore, the format requirement
is actually that of YAML. Users
knowledgeable of YAML can use other
equivalent way to specify the string.
The vector es1 is for correlations,
regression coefficients, and
indirect effect, and the
vector es2 is for for standardized
moderation effect, the coefficients
of a product term. These labels
are to be used in interpreting
the specification in pop_es.
The arguments number_of_indicators
and reliability are used to
specify the number of indicators
(e.g., items) for each factor,
and the population reliability
coefficient of each factor,
if the variables in the model
syntax are latent variables.
Optionally, loading_difference can
be used to generate indicators with
unequal standardized factor loadings,
and reference can be used to specify
the indicator with the medium,
strongest, or weakest standardized
factor loading as the first indicator,
which is used as the reference indicator
in lavaan.
If a variable in the model is to be
replaced by indicators in the generated
data, set
number_of_indicators to a named
numeric vector. The names are the
variables of variables with
indicators, as appeared in the
model syntax. The value of each
name is the number of indicators.
The
argument reliability should then be
set a named numeric vector (or list,
see the section on multigroup models)
to specify the population reliability
coefficient ("omega") of each set of
indicators. The population standardized factor
loadings are then computed to ensure
that the population reliability
coefficient is of the target value.
These are examples for a single group model:
number of indicator = c(m = 3, x = 4, y = 5)
The numbers of indicators for m,
x, and y are 3, 4, and 5,
respectively.
reliability = c(m = .90, x = .80, y = .70)
The population reliability
coefficients of m, x, and y are
.90, .80, and .70, respectively.
Multigroup models are supported.
The number of groups is inferred
from pop_es (see the help page
of ptable_pop()), or directly
from ptable.
For a multigroup model, the number of indicators for each variable must be the same across groups.
However, the population reliability
coefficients can be different
across groups. For a multigroup model
of k groups,
with one or more population reliability
coefficients differ across groups,
the argument reliability should be
set to a named list. The names are
the variables to which the population
reliability coefficients are to be
set. The element for each name is
either a single value for the common
reliability coefficient, or a
numeric vector of the reliability
coefficient of each group.
This is an example of reliability
for a model with 2 groups:
reliability = list(x = .80, m = c(.70, .80))
The reliability coefficients of x are
.80 in all groups, while the
reliability coefficients of m are
.70 in one group and .80 in another.
If all variables in the model has
the same number of indicators,
number_of_indicators can be set
to one single value.
Similarly, if all sets of indicators
have the same population reliability
in all groups, reliability can also
be set to one single value.
By default, variables and error
terms are generated
from a multivariate normal distribution.
If desired, users can supply the
function used to generate an exogenous
variable and error term by setting x_fun to
a named list.
The names of the list are the variables for which a user function will be used to generate the data.
Each element of the list must also be a list. The first element of this list, can be unnamed, is the function to be used. If other arguments need to be supplied, they should be included as named elements of this list.
For example:
x_fun = list(x = list(power4mome::rexp_rs),
w = list(power4mome::rbinary_rs,
p1 = .70)))
The variables x and w will be
generated by user-supplied functions.
For x, the function is
power4mome::rexp_rs. No additional
argument when calling this function.
For w, the function is
power4mome::rbinary_rx. The argument
p1 = .70 will be passed to this
function when generating the values
of w.
If a variable is an endogenous
variable (e.g., being predicted by
another variable in a model), then
x_fun is used to generate its
error term. Its implied population
distribution may still be different
from that generate by x_fun because
the distribution also depends on the
distribution of other variables
predicting it.
These are requirements for the user-functions:
They must return a numeric vector.
They mush has an argument n for
the number of values.
The population mean and standard deviation of the generated values must be 0 and 1, respectively.
The package power4mome has
helper functions for generating
values from some common nonnormal
distributions and then scaling them
to have population mean and standard
deviation equal to 0 and 1 (by default), respectively.
These are some of them:
To use x_fun, the variables must
have zero covariances with other
variables in the population. It is
possible to generate nonnormal
multivariate data but we believe this
is rarely needed when estimating
power before having the data.
The function set to test_fun,
the test function, usually
should work
on the output of lavaan::sem(),
lmhelprs::many_lm(), or
stats::lm(), but can also be a
function that works on the output
of the function set to fit_function
when calling fit_model() or
power4test() (see fit_model_args).
The function has two default requirements.
First, it has an argument fit, to
be set to the output of
lavaan::sem() or another output
stored in the element extra$fit of
a replication in the sim_all
object. (This requirement can be
relaxed; see the section on map_names.)
That is, the function definition
should be of this from: FUN(fit, ...). This is the form of all
test_* functions in power4mome.
If other arguments are to be passed
to the test function, they can be set
to test_args as a named list.
Second, the test function must returns an output that (a) can be processed by the results function (see below), or (b) is of the required format for the output of a results function (see the next section). If it already returns an output of the required format, then there is no need to set the results function.
The test results will be extracted
from the output of test_fun by the
function set to results_fun,
the results function. If
the test_fun already returns an
output of the expected format
(see below), then set results_fun
to NULL, the default. The output
of test_fun will be used for
estimating power.
The function set to results_fun
must accept the output of test_fun,
as the first argument, and return a
named list (which can be a data frame)
or a named vector with some
of the following
elements:
est: Optional. The estimate of a
parameter, if applicable.
se: Optional. The standard error
of the estimate, if applicable.
cilo: Optional. The lower limit of the
confidence interval, if applicable.
cihi: Optional. The upper limit of the
confidence interval, if applicable.
sig: Required. If 1, the test is
significant. If 0, the test is not
significant. If the test cannot be
done for any reason, it should be
NA.
The results can then be used to estimate the power or Type I error of the test.
For example, if
the null hypothesis is false, then
the proportion of significant, that
is, the mean of the values of sig
across replications, is the power.
The package power4mome has some ready-to-use
test functions:
Please refer to their help pages for examples.
This argument is for developers using
a test function that has a different
name for the argument of the fit
object ("fit", the default).
If test_fun is set to a function
that works on an output of, say,
lavaan::sem() but the argument name
for the output is not fit, the
mapping can be changed by
map_names.
For example,
lavaan::parameterEstimates()
receives an output of lavaan::sem()
and reports the test results of model
parameters. However, the argument
name for the lavaan output is
object. To instruct do_test() to
do the test correctly when setting
test_fun to
lavaan::parameterEstimates, add
map_names = c(object = "fit"). The
element fit in a replication will
then set to the argument object of
lavaan::parameterEstimates().
# Specify the model model_simple_med <- " m ~ x y ~ m + x " # Specify the population values model_simple_med_es <- " m ~ x: m y ~ m: l y ~ x: n " # Set nrep to a large number in real analysis, such as 400 # Set `parallel` to TRUE for faster, usually much faster, analysis # Set `progress` to TRUE to display the progress of the analysis out <- power4test(nrep = 10, model = model_simple_med, pop_es = model_simple_med_es, n = 100, test_fun = test_parameters, test_args = list(pars = "m~x"), iseed = 1234, parallel = FALSE, progress = TRUE) print(out, test_long = TRUE) # Change the sample size out1 <- power4test(out, n = 200, iseed = 1234, parallel = FALSE, progress = TRUE) print(out1, test_long = TRUE) # Add one more test out2 <- power4test(out, test_fun = test_parameters, test_args = list(pars = "y~x"), parallel = FALSE, progress = TRUE) print(out2, test_long = TRUE)# Specify the model model_simple_med <- " m ~ x y ~ m + x " # Specify the population values model_simple_med_es <- " m ~ x: m y ~ m: l y ~ x: n " # Set nrep to a large number in real analysis, such as 400 # Set `parallel` to TRUE for faster, usually much faster, analysis # Set `progress` to TRUE to display the progress of the analysis out <- power4test(nrep = 10, model = model_simple_med, pop_es = model_simple_med_es, n = 100, test_fun = test_parameters, test_args = list(pars = "m~x"), iseed = 1234, parallel = FALSE, progress = TRUE) print(out, test_long = TRUE) # Change the sample size out1 <- power4test(out, n = 200, iseed = 1234, parallel = FALSE, progress = TRUE) print(out1, test_long = TRUE) # Add one more test out2 <- power4test(out, test_fun = test_parameters, test_args = list(pars = "y~x"), parallel = FALSE, progress = TRUE) print(out2, test_long = TRUE)
Estimate power for a set of effect sizes (population values of a model parameter).
power4test_by_es( object, pop_es_name = NULL, pop_es_values = NULL, progress = TRUE, ..., by_seed = NULL, by_nrep = NULL, save_sim_all = TRUE ) ## S3 method for class 'power4test_by_es' c(..., sort = TRUE, skip_checking_models = FALSE) as.power4test_by_es(original_object, pop_es_name) ## S3 method for class 'power4test_by_es' print(x, print_all = FALSE, digits = 3, ...)power4test_by_es( object, pop_es_name = NULL, pop_es_values = NULL, progress = TRUE, ..., by_seed = NULL, by_nrep = NULL, save_sim_all = TRUE ) ## S3 method for class 'power4test_by_es' c(..., sort = TRUE, skip_checking_models = FALSE) as.power4test_by_es(original_object, pop_es_name) ## S3 method for class 'power4test_by_es' print(x, print_all = FALSE, digits = 3, ...)
object |
A |
pop_es_name |
The name of the
parameter. See the help page
of |
pop_es_values |
A numeric
vector of the population values
of the parameter specified in
|
progress |
Logical. Whether progress of the simulation will be displayed. |
... |
For |
by_seed |
If set to a number,
it will be used to generate the
seeds for each call to |
by_nrep |
If set to a number,
it will be used to generate the
number of replications ( |
save_sim_all |
If |
sort |
WHen combining the
objects, whether they will be sorted
by population values. Default is |
skip_checking_models |
Whether
the check of the data generation model
will be checked. Default is |
original_object |
The object
to be converted to a |
x |
The object to be printed. |
print_all |
If |
digits |
The numbers of digits displayed after the decimal. |
The function power4test_by_es()
regenerates
datasets for a set of effect sizes
(population values of a model parmeter)
and does the stored tests in each of
them.
Optionally, it can also be run
on a object with no stored tests.
In this case, additional arguments
must be set to instruct power4test()
on the tests to be conducted.
It is usually used to examine the power over a sets of effect sizes (population values).
The c method of power4test_by_es
objects
is used to combine tests from different
runs of power4test_by_es().
The function as.power4test_by_es()
is used to convert a power4test
object to a power4test_by_es
object, if it is not already one.
Useful when concatenating
power4test objects with
power4test_by_es objects.
The function
power4test_by_es() returns a
power4test_by_es object, which is a
list of power4test objects, one for
each population value of the parameter.
The method c.power4test_by_es() returns
a power4test_by_es object with
all the elements (tests for different
values of pop_es_values) combined.
The function as.power4test_by_es() returns
a power4test_by_es object converted
from the input object.
The print-method of power4test_by_es
objects returns the object invisibly.
It is called for its side-effect.
# Specify the model model_simple_med <- " m ~ x y ~ m + x " # Specify the population values model_simple_med_es <- " m ~ x: m y ~ m: l y ~ x: n " sim_only <- power4test(nrep = 2, model = model_simple_med, pop_es = model_simple_med_es, n = 100, R = 40, ci_type = "boot", fit_model_args = list(fit_function = "lm"), do_the_test = FALSE, iseed = 1234) test_out <- power4test(object = sim_only, test_fun = test_indirect_effect, test_args = list(x = "x", m = "m", y = "y", boot_ci = TRUE, mc_ci = FALSE)) out <- power4test_by_es(test_out, pop_es_name = "y ~ m", pop_es_values = c(.10, .20)) out_reject <- rejection_rates(out) out_reject# Specify the model model_simple_med <- " m ~ x y ~ m + x " # Specify the population values model_simple_med_es <- " m ~ x: m y ~ m: l y ~ x: n " sim_only <- power4test(nrep = 2, model = model_simple_med, pop_es = model_simple_med_es, n = 100, R = 40, ci_type = "boot", fit_model_args = list(fit_function = "lm"), do_the_test = FALSE, iseed = 1234) test_out <- power4test(object = sim_only, test_fun = test_indirect_effect, test_args = list(x = "x", m = "m", y = "y", boot_ci = TRUE, mc_ci = FALSE)) out <- power4test_by_es(test_out, pop_es_name = "y ~ m", pop_es_values = c(.10, .20)) out_reject <- rejection_rates(out) out_reject
Estimate power for a set of sample sizes.
power4test_by_n( object, n = NULL, progress = TRUE, ..., by_seed = NULL, by_nrep = NULL, save_sim_all = TRUE ) ## S3 method for class 'power4test_by_n' c( ..., sort = TRUE, skip_checking_models = FALSE, tolerance_if_std_by_monte_carlo = getOption("power4mome.tolerance_if_std_by_monte_carlo", default = 0.01) ) as.power4test_by_n(original_object) ## S3 method for class 'power4test_by_n' print(x, print_all = FALSE, ...)power4test_by_n( object, n = NULL, progress = TRUE, ..., by_seed = NULL, by_nrep = NULL, save_sim_all = TRUE ) ## S3 method for class 'power4test_by_n' c( ..., sort = TRUE, skip_checking_models = FALSE, tolerance_if_std_by_monte_carlo = getOption("power4mome.tolerance_if_std_by_monte_carlo", default = 0.01) ) as.power4test_by_n(original_object) ## S3 method for class 'power4test_by_n' print(x, print_all = FALSE, ...)
object |
A |
n |
A numeric vector of sample sizes for which the simulation will be conducted. |
progress |
Logical. Whether progress of the simulation will be displayed. |
... |
For |
by_seed |
If set to a number,
it will be used to generate the
seeds for each call to |
by_nrep |
If set to a number,
it will be used to generate the
number of replications ( |
save_sim_all |
If |
sort |
When combining the
objects, whether they will be sorted
by sample sizes. Default is |
skip_checking_models |
Whether
the check of the data generation model
will be checked. Default is |
tolerance_if_std_by_monte_carlo |
The tolerance to be used by |
original_object |
The object
to be converted to a |
x |
The object to be printed. |
print_all |
If |
The function power4test_by_n()
regenerates
datasets for a set of sample sizes
and does the stored tests in each of
them.
Optionally, it can also be run
on a object with no stored tests.
In this case, additional arguments
must be set to instruct power4test()
on the tests to be conducted.
It is usually used to examine the power over a set of sample sizes.
The c method of power4test_by_n
objects
is used to combine tests from different
runs of power4test_by_n().
The function as.power4test_by_n()
is used to convert a power4test
object to a power4test_by_n
object, if it is not already one.
Useful when concatenating
power4test objects with
power4test_by_n objects.
The function
power4test_by_n() returns a
power4test_by_n object, which is a
list of power4test objects, one for
each sample size.
The method c.power4test_by_n() returns
a power4test_by_n object with
all the elements (tests for different
sample sizes) combined.
The function as.power4test_by_n() returns
a power4test_by_n object converted
from the input object.
The print-method of power4test_by_n
objects returns the object invisibly.
It is called for its side-effect.
# Specify the model model_simple_med <- " m ~ x y ~ m + x " # Specify the population values model_simple_med_es <- " m ~ x: m y ~ m: l y ~ x: n " sim_only <- power4test(nrep = 2, model = model_simple_med, pop_es = model_simple_med_es, n = 100, R = 40, ci_type = "boot", fit_model_args = list(fit_function = "lm"), do_the_test = FALSE, iseed = 1234) test_out <- power4test(object = sim_only, test_fun = test_indirect_effect, test_args = list(x = "x", m = "m", y = "y", boot_ci = TRUE, mc_ci = FALSE)) out <- power4test_by_n(test_out, n = c(100, 110, 120)) out_reject <- rejection_rates(out) out_reject# Specify the model model_simple_med <- " m ~ x y ~ m + x " # Specify the population values model_simple_med_es <- " m ~ x: m y ~ m: l y ~ x: n " sim_only <- power4test(nrep = 2, model = model_simple_med, pop_es = model_simple_med_es, n = 100, R = 40, ci_type = "boot", fit_model_args = list(fit_function = "lm"), do_the_test = FALSE, iseed = 1234) test_out <- power4test(object = sim_only, test_fun = test_indirect_effect, test_args = list(x = "x", m = "m", y = "y", boot_ci = TRUE, mc_ci = FALSE)) out <- power4test_by_n(test_out, n = c(100, 110, 120)) out_reject <- rejection_rates(out) out_reject
Compute the predicted
values in a model fitted by
power_curve().
## S3 method for class 'power_curve' predict(object, newdata, ...)## S3 method for class 'power_curve' predict(object, newdata, ...)
object |
A |
newdata |
A data frame with
a column named |
... |
Additional arguments.
Passed to the corresponding
|
The predict method of power_curve
objects works in two modes.
If new
data is not supplied (through
newdata), it retrieves the stored
results and calls the corresponding
methods to compute the predicted
values, which are the predicted
rejection rates (power levels if
the null hypothesis is false,
e.g., the population effect size is
equal to zero).
If new data is supplied, such as a named list with a vector of sample sizes, they will be used to compute the predicted rejection rates.
It returns a numeric vector of the predicted rejection rates.
# Specify the population model model_simple_med <- " m ~ x y ~ m + x " # Specify the effect sizes (population parameter values) model_simple_med_es <- " y ~ m: l m ~ x: m y ~ x: s " # Simulate datasets to check the model # Set `parallel` to TRUE for faster, usually much faster, analysis # Set `progress` to TRUE to display the progress of the analysis sim_only <- power4test(nrep = 5, model = model_simple_med, pop_es = model_simple_med_es, n = 50, fit_model_args = list(fit_function = "lm"), do_the_test = FALSE, iseed = 1234, parallel = FALSE, progress = FALSE) # By n: Do a test for different sample sizes out1 <- power4test_by_n(sim_only, nrep = 5, test_fun = test_parameters, test_args = list(par = "y~x"), n = c(25, 100, 200, 1000), by_seed = 1234, parallel = FALSE, progress = FALSE) pout1 <- power_curve(out1) pout1 predict(pout1, newdata = list(x = c(150, 250, 500)))# Specify the population model model_simple_med <- " m ~ x y ~ m + x " # Specify the effect sizes (population parameter values) model_simple_med_es <- " y ~ m: l m ~ x: m y ~ x: s " # Simulate datasets to check the model # Set `parallel` to TRUE for faster, usually much faster, analysis # Set `progress` to TRUE to display the progress of the analysis sim_only <- power4test(nrep = 5, model = model_simple_med, pop_es = model_simple_med_es, n = 50, fit_model_args = list(fit_function = "lm"), do_the_test = FALSE, iseed = 1234, parallel = FALSE, progress = FALSE) # By n: Do a test for different sample sizes out1 <- power4test_by_n(sim_only, nrep = 5, test_fun = test_parameters, test_args = list(par = "y~x"), n = c(25, 100, 200, 1000), by_seed = 1234, parallel = FALSE, progress = FALSE) pout1 <- power_curve(out1) pout1 predict(pout1, newdata = list(x = c(150, 250, 500)))
Generate the complete population model using the model syntax and user-specified effect sizes (population parameter values).
ptable_pop( model = NULL, pop_es = NULL, es1 = c(n = 0, nil = 0, s = 0.1, m = 0.3, l = 0.5, si = 0.141, mi = 0.361, li = 0.51, sm = 0.2, ml = 0.4), es2 = c(n = 0, nil = 0, s = 0.05, m = 0.1, l = 0.15, sm = 0.075, ml = 0.125), es_ind = c("si", "mi", "li"), standardized = TRUE, n_std = 1e+05, std_force_monte_carlo = FALSE, add_cov_for_moderation = TRUE ) model_matrices_pop(x, ..., drop_list_single_group = TRUE)ptable_pop( model = NULL, pop_es = NULL, es1 = c(n = 0, nil = 0, s = 0.1, m = 0.3, l = 0.5, si = 0.141, mi = 0.361, li = 0.51, sm = 0.2, ml = 0.4), es2 = c(n = 0, nil = 0, s = 0.05, m = 0.1, l = 0.15, sm = 0.075, ml = 0.125), es_ind = c("si", "mi", "li"), standardized = TRUE, n_std = 1e+05, std_force_monte_carlo = FALSE, add_cov_for_moderation = TRUE ) model_matrices_pop(x, ..., drop_list_single_group = TRUE)
model |
String. The model defined
by |
pop_es |
It can be a data frame
with these columns: |
es1 |
Set the
values for each label of the effect
size (population value) for correlations and regression
paths.
Used only if |
es2 |
Set the
values for each label of the effect
size (population value) for product term.
Used only if |
es_ind |
The names of labels denoting the effect size of an indirect effect. They will be used to determine the population values of the component paths along an indirect path. |
standardized |
Logical. If
|
n_std |
The sample size used to
determine the error variances by
simulation when |
std_force_monte_carlo |
Logical.
If |
add_cov_for_moderation |
Logical. If |
x |
It can be a 'lavaan' model
syntax, to be passed to |
... |
If |
drop_list_single_group |
If
|
The function ptable_pop() generates a lavaan
parameter table that can be used
to generate data based on the population
values of model parameters.
The function ptable_pop() returns
a lavaan parameter table of the
model, with the column start set to the
population values.
The function model_matrices_pop()
returns a lavaan LISREL-style model
matrices (like the output of
lavaan::lavInspect() with what
set to "free"), with matrix elements
set to the population values. If
x is the model syntax, it will be
stored in the attributes model.
If the model is a multigroup model
with k groups (k greater than 1),
then it returns a list of k lists
of lavaan LISREL-style model
matrices unless drop_list_single_group
is TRUE.
ptable_pop()
The function ptable_pop() is used by
the all-in-one function
power4test(). Users usually do not
call this function directly, though
developers can use this function to
develop other functions for power
analysis, or to build their own
workflows to do the power analysis.
For a single-group model, model
should be a lavaan model syntax
string of the form of the model.
The population values of the model
parameters are to be determined by
pop_es.
If the model has latent factors,
the syntax in model should specify
only the structural model for the
latent factors. There is no
need to specify the measurement
part. Other functions will generate
the measurement part on top of this
model.
For example, this is a simple mediation model:
"m ~ x y ~ m + x"
Whether m, x, and y denote
observed variables or latent factors
are determined by other functions,
such as power4test().
Because the model is the population
model, equality constraints are
irrelevant and the model syntax
specifies only the form of the
model. Therefore, model is
specified as in the case of single
group models.
The argument pop_es is for specifying
the population values of model
parameters. This section describes
how to do this using named vectors.
If pop_es is specified by a named
vector, it must follow the convention
below.
The names of the vectors are
lavaan names for the selected
parameters. For example, m ~ x
denotes the path from x to m.
Alternatively, the names can be
either ".beta." or ".cov.".
Use ".beta." to set the default
values for all regression coefficients.
Use ".cov." to set the default
values for all correlations of
exogenous variables (e.g., predictors).
The names can also be of this form:
".ind.(<path>)", whether <path>
denote path in the model. For
example, ".ind.(x->m->y)" denotes
the path from x through m to
y. Alternatively, the lavaan
symbol ~ can also be used:
".ind.(y~m~x)". This form is used
to set the indirect effect (standardized,
by default) along this path. The
value for this name will override
other settings.
If using lavaan names, can
specify more than one parameter
using +. For example, y ~ m + x
denotes the two paths from m and
x to y.
The value of each element can be
the label for the effect size: n
for nil, s for small, m for
medium, and l for large. The
value for each label is determined
by es1 and es2. See the section
on specifying these two arguments.
The value of pop_es can also be
set to a value, but it must be
quoted as a string, such as "y ~ x" = ".31".
This is an example:
c(".beta." = "s",
"m1 ~ x" = "-m",
"m2 ~ m1" = "l",
"y ~ x:w" = "s")
In this example,
All regression coefficients are
set to small (s) by default, unless
specified otherwise.
The path from x to m1 is
set to medium and negative (-m).
The path from m1 to m2 is set
to large (l).
The coefficient of the product
term x:w when predicting y is
set to small (s).
When setting an indirect effect to
a symbol (default: "si", "mi",
"li", with "i" added to differentiate
them from the labels for a direct path),
the corresponding value is used to
determine the population values of
all component paths along the pathway.
All the values are assumed to be equal.
Therefore, ".ind.(x->m->y)" = ".20"
is equivalent to setting m ~ x
and y ~ m to the square root of
.20, such that the corresponding
indirect effect is equal to the
designated value.
This behavior, though restricted, is for quick manipulation of the indirect effect. If different values along a pathway, set the value for each path directly.
Only nonnegative value is supported.
Therefore, ".ind.(x->m->y)" = "-si"
and ".ind.(x->m->y)" = "-.20" will
throw an error.
The argument pop_es also supports multigroup
models.
For pop_es, instead of
named vectors, named list of
named vectors should be used.
The names are the parameters, or
keywords such as .beta. and
.cov., like specifying the
population values for a single
group model.
The elements are character vectors. If it has only one element (e.g., a single string), then it is the the population value for all groups. If it has more than one element (e.g., a vector of three strings), then they are the population values of the groups. For a model of k groups, each vector must have either k elements or one element.
This is an example:
list("m ~ x" = "m",
"y ~ m" = c("s", "m", "l"))
In this model, the population value
of the path m ~ x is medium (m) for
all groups, while the population
values for the path y ~ m are
small (s), medium (m), and large (l),
respectively.
When setting the argument pop_es,
instead of using a named vector
or named list for
pop_es, the population values of
model parameters can also be
specified using a multiline string,
as illustrated below, to be parsed
by pop_es_yaml().
This is an example of the multiline string for a single-group model:
y ~ m: l m ~ x: m y ~ x: nil
The string must follow this format:
Each line starts with tag:.
tag can be the name of a
parameter, in lavaan model
syntax format.
For example, m ~ x
denotes the path from x to m.
A tag in lavaan model syntax can
specify more than one parameter
using +.
For example, y ~ m + x
denotes the two paths from m and
x to y.
Alternatively, the tag can be
either .beta. or .cov..
Use .beta. to set the default
values for all regression coefficients.
Use .cov. to set the default
values for all correlations of
exogenous variables (e.g., predictors).
After each tag is the value of the population value:
-nil for nil (zero),
s for small,
m for medium, and
l for large.
si, mi, and li for
small, medium, and large a
standardized indirect effect,
respectively.
Note: n cannot be used in this mode.
The
value for each label is determined
by es1 and es2 as described
in ptable_pop().
The value can also be
set to a numeric value, such as
.30 or -.30.
This is another example:
.beta.: s y ~ m: l
In this example, all regression
coefficients are small, while
the path from m to y is large.
This is an example of the string for a multigroup model:
y ~ m: l m ~ x: - nil - s y ~ x: nil
The format is similar to that for
a single-group model. If a parameter
has the same value for all groups,
then the line can be specified
as in the case of a single-group
model: tag: value.
If a parameter has different values across groups, then it must be in this format:
A line starts with the tag, followed
by two or more lines. Each line
starts with a hyphen - and the
value for a group.
For example:
m ~ x: - nil - s
This denotes that the model has
two groups. The values of the path
from x to m for the two
groups are 0 (nil) and
small (s), respectively.
Another equivalent way to specify
the values are using [], on
the same line of a tag.
For example:
m ~ x: [nil, s]
The number of groups is inferred from the number of values for a parameter. Therefore, if a tag has more than one value, each tag must has the same number of value, or only one value.
The tag .beta. and .cov. can
also be used for multigroup models.
Note that using named vectors or named lists is more reliable. However, using a multiline string is more user-friendly. If this method failed, please use named vectors or named list instead.
The multiline string is parsed by yaml::read_yaml().
Therefore, the format requirement
is actually that of YAML. Users
knowledgeable of YAML can use other
equivalent way to specify the string.
The vector es1 is for correlations,
regression coefficients, and
indirect effect, and the
vector es2 is for for standardized
moderation effect, the coefficients
of a product term. These labels
are to be used in interpreting
the specification in pop_es.
model_matrices_pop()
The function model_matrices_pop()
generates models matrices with
population values, used by ptable_pop().
Users usually do not
call this function directly, though
developers can use this build their own
workflows to generate the data.
power4test(), and
pop_es_yaml() on an alternative
way to specify population values.
# Specify the model model1 <- " m1 ~ x + c1 m2 ~ m1 + x2 + c1 y ~ m2 + m1 + x + w + x:w + c1 " # Specify the population values model1_es <- c("m1 ~ x" = "-m", "m2 ~ m1" = "s", "y ~ m2" = "l", "y ~ x" = "m", "y ~ w" = "s", "y ~ x:w" = "s", "x ~~ w" = "s") ptable_final1 <- ptable_pop(model1, pop_es = model1_es) ptable_final1 # Use a multiline string, illustrated by a simpler model model2 <- " m ~ x y ~ m + x " model2_es_a <- c("m ~ x" = "s", "y ~ m" = "m", "y ~ x" = "nil") model2_es_b <- " m ~ x: s y ~ m: m y ~ x: nil " ptable_model2_a <- ptable_pop(model2, pop_es = model2_es_a) ptable_model2_b <- ptable_pop(model2, pop_es = model2_es_b) ptable_model2_a ptable_model2_b identical(ptable_model2_a, ptable_model2_b) # model_matrices_pop model_matrices_pop(ptable_final1) model_matrices_pop(model1, pop_es = model1_es)# Specify the model model1 <- " m1 ~ x + c1 m2 ~ m1 + x2 + c1 y ~ m2 + m1 + x + w + x:w + c1 " # Specify the population values model1_es <- c("m1 ~ x" = "-m", "m2 ~ m1" = "s", "y ~ m2" = "l", "y ~ x" = "m", "y ~ w" = "s", "y ~ x:w" = "s", "x ~~ w" = "s") ptable_final1 <- ptable_pop(model1, pop_es = model1_es) ptable_final1 # Use a multiline string, illustrated by a simpler model model2 <- " m ~ x y ~ m + x " model2_es_a <- c("m ~ x" = "s", "y ~ m" = "m", "y ~ x" = "nil") model2_es_b <- " m ~ x: s y ~ m: m y ~ x: nil " ptable_model2_a <- ptable_pop(model2, pop_es = model2_es_a) ptable_model2_b <- ptable_pop(model2, pop_es = model2_es_b) ptable_model2_a ptable_model2_b identical(ptable_model2_a, ptable_model2_b) # model_matrices_pop model_matrices_pop(ptable_final1) model_matrices_pop(model1, pop_es = model1_es)
All-in-one functions for estimating power or finding the region with target power for common mediation models.
q_power_mediation( model = NULL, pop_es = NULL, number_of_indicators = NULL, reliability = NULL, loading_difference = NULL, reference = NULL, test_fun = NULL, test_more_args = list(), target_power = 0.8, nrep = NULL, n = NULL, R = 1000, ci_type = c("mc", "boot"), seed = NULL, iseed = NULL, parallel = TRUE, progress = TRUE, simulation_progress = NULL, max_trials = NULL, algorithm = NULL, ..., mode = c("power", "region", "n") ) ## S3 method for class 'q_power_mediation' print(x, mode = c("all", "region", "n", "power"), ...) ## S3 method for class 'q_power_mediation' plot(x, ...) ## S3 method for class 'q_power_mediation' summary(object, ...) q_power_mediation_simple( a = "m", b = "m", cp = "n", number_of_indicators = NULL, reliability = NULL, loading_difference = NULL, reference = NULL, test_more_args = list(), target_power = 0.8, nrep = NULL, n = NULL, R = 1000, ci_type = c("mc", "boot"), seed = NULL, iseed = NULL, parallel = TRUE, progress = TRUE, simulation_progress = NULL, max_trials = NULL, algorithm = NULL, ..., mode = c("power", "region", "n") ) q_power_mediation_serial( ab = c("m", "m"), ab_other = "n", cp = "n", number_of_indicators = NULL, reliability = NULL, loading_difference = NULL, reference = NULL, test_more_args = list(), target_power = 0.8, nrep = NULL, n = NULL, R = 1000, ci_type = c("mc", "boot"), seed = NULL, iseed = NULL, parallel = TRUE, progress = TRUE, simulation_progress = NULL, max_trials = NULL, algorithm = NULL, ..., mode = c("power", "region", "n") ) q_power_mediation_parallel( as = c("m", "m"), bs = c("m", "m"), cp = "n", number_of_indicators = NULL, reliability = NULL, loading_difference = NULL, reference = NULL, omnibus = c("all_sig", "at_least_one_sig", "at_least_k_sig"), at_least_k = 1, test_more_args = list(), target_power = 0.8, nrep = NULL, n = NULL, R = 1000, ci_type = c("mc", "boot"), seed = NULL, iseed = NULL, parallel = TRUE, progress = TRUE, simulation_progress = NULL, max_trials = NULL, algorithm = NULL, ..., mode = c("power", "region", "n") )q_power_mediation( model = NULL, pop_es = NULL, number_of_indicators = NULL, reliability = NULL, loading_difference = NULL, reference = NULL, test_fun = NULL, test_more_args = list(), target_power = 0.8, nrep = NULL, n = NULL, R = 1000, ci_type = c("mc", "boot"), seed = NULL, iseed = NULL, parallel = TRUE, progress = TRUE, simulation_progress = NULL, max_trials = NULL, algorithm = NULL, ..., mode = c("power", "region", "n") ) ## S3 method for class 'q_power_mediation' print(x, mode = c("all", "region", "n", "power"), ...) ## S3 method for class 'q_power_mediation' plot(x, ...) ## S3 method for class 'q_power_mediation' summary(object, ...) q_power_mediation_simple( a = "m", b = "m", cp = "n", number_of_indicators = NULL, reliability = NULL, loading_difference = NULL, reference = NULL, test_more_args = list(), target_power = 0.8, nrep = NULL, n = NULL, R = 1000, ci_type = c("mc", "boot"), seed = NULL, iseed = NULL, parallel = TRUE, progress = TRUE, simulation_progress = NULL, max_trials = NULL, algorithm = NULL, ..., mode = c("power", "region", "n") ) q_power_mediation_serial( ab = c("m", "m"), ab_other = "n", cp = "n", number_of_indicators = NULL, reliability = NULL, loading_difference = NULL, reference = NULL, test_more_args = list(), target_power = 0.8, nrep = NULL, n = NULL, R = 1000, ci_type = c("mc", "boot"), seed = NULL, iseed = NULL, parallel = TRUE, progress = TRUE, simulation_progress = NULL, max_trials = NULL, algorithm = NULL, ..., mode = c("power", "region", "n") ) q_power_mediation_parallel( as = c("m", "m"), bs = c("m", "m"), cp = "n", number_of_indicators = NULL, reliability = NULL, loading_difference = NULL, reference = NULL, omnibus = c("all_sig", "at_least_one_sig", "at_least_k_sig"), at_least_k = 1, test_more_args = list(), target_power = 0.8, nrep = NULL, n = NULL, R = 1000, ci_type = c("mc", "boot"), seed = NULL, iseed = NULL, parallel = TRUE, progress = TRUE, simulation_progress = NULL, max_trials = NULL, algorithm = NULL, ..., mode = c("power", "region", "n") )
model |
The |
pop_es |
The character vector or
multiline string to
specify population effect sizes
(population values of parameters). See
the help page on how to specify this
argument.
Ignored if |
number_of_indicators |
A named
vector to specify the number of
indicators for each factors. See
the help page on how to set this
argument. Default is |
reliability |
A named vector
(for a single-group model) or a
named list of named vectors
(for a multigroup model)
to set the reliability coefficient
of each set of indicators. Default
is |
loading_difference |
A named vector
(for a single-group model) or a
named list of named vectors
(for a multigroup model)
to set the difference in factor
loadings between neighboring indicators
of each set of indicators. Default
is |
reference |
A named vector
(for a single-group model) or a
named list of named vectors
(for a multigroup model)
to indicate which indicator will
be the first indicator (and so
is the reference indicator, by default).
Default
is |
test_fun |
A function to do the
test. See 'Details' of |
test_more_args |
A named list of
additional arguments to be passed
to the test function
( |
target_power |
The target power, a value greater than 0 and less than one. |
nrep |
The number of replications
to generate the simulated datasets.
Default is |
n |
The sample size for the
first run of |
R |
The number of replications
to generate the Monte Carlo or
bootstrapping estimates
for each fit output. No Monte Carlo
nor bootstrapping
estimates will be generated if |
ci_type |
The type of
simulation-based confidence
intervals to use. Can be either
|
seed |
The seed for the random
number generator. Used by
|
iseed |
The seed for the random
number generator. Used by
|
parallel |
If |
progress |
If |
simulation_progress |
Logical.
Whether the progress in each call
to |
max_trials |
The maximum number
of trials in searching the value
with the target power. Rounded
up if not an integer.
If |
algorithm |
The algorithm to be
used in mode |
... |
For |
mode |
For |
x |
The object for the relevant methods. |
object |
For the |
a |
For a simple mediation
model, this is the population effect
size for the path from |
b |
For a simple mediation
model, this is the population effect
size for the path from |
cp |
For a simple mediation
model, this is the population effect
size for the direct path from
|
ab |
For a serial mediation
model, this is a numeric vector
of the population effect
sizes along the path
|
ab_other |
Should be one single
value. This is the population effect
sizes of all other paths not along
|
as |
For a parallel mediation
model, this is a numeric vector
of the population effect
sizes for the paths from |
bs |
For a parallel mediation
model, this is a numeric vector
of the population effect
sizes for the paths from the
mediators to |
omnibus |
|
at_least_k |
The minimum number
of paths required to be significant
for the omnibus test to be considered
significant. Used when
|
A named list of outputs.
If mode is power, then a
power4test object is set to the
element power4test.
If mode is region, then a
n_region_from_power object is
set of the element n_region_from_power.
If mode is n_from_power, then a
n_from_power object is
set to the n_from_power element.
The print method of q_power_mediation
returns x invisibly. Called for
its side effect.
The plot-method of q_power_mediation
returns NULL.
It will plot either the output of
n_region_from_power()
(mode "region") or the output of
n_from_power() (mode "n").
If no output from
either of these functions is available,
nothing wil be plotted.
The summary method for
q_power_mediation objects returns
the output of summary() for
either the output of n_region_from_power()
(mode "region") or the output of
n_from_power() (mode "n").
Return NULL if no output from
either of these functions is available.
These functions are wrappers
that call power4test() and
n_region_from_power() to (a)
estimate the level of power for
a mediation model, given the population
effects and the sample size, and
(b) find the region of sample sizes
with the levels of power not
significantly different from the
target power.
They are convenient functions that
set the argument values automatically
for common mediation models before
calling power4test() and
n_region_from_power(). Please refer
to the help pages of these two
functions for the details on how
the estimation and the search are
conducted.
For some arguments not described in
details here, please refer to the
help pages of power4test()
and n_region_from_power(),
The function q_power_mediation_simple()
can be used for the power analysis of
a simple mediation model with only
one mediator.
This function will fit the following model:
"m ~ x y ~ m + x"
The function q_power_mediation_serial()
can be used for the power analysis of
a serial mediation model with only
any number of mediators.
This is the model being fitted if the model has two mediators:
"m1 ~ x m2 ~ m1 + x y ~ m2 + m1 + x"
The function q_power_mediation_parallel()
can be used for the power analysis of
a parallel mediation model with only
any number of mediators.
This is the model being fitted if the model has two mediators:
"m1 ~ x m2 ~ x y ~ m2 + m1 + x"
The function q_power_mediation(),
an advanced function,
can be used for the power analysis of
an arbitrary mediation model.
The model and the population effect
sizes are specified as in power4test().
This is an example of a model with both parallel paths and serial paths:
model <- " m1 ~ x m21 ~ m1 m22 ~ m1 y ~ m21 + m22 + x "
pop_es <- " m1 ~ x: m m21 ~ m1: m m22 ~ m1: m y ~ m21: m y ~ m22: m "
Knowledge of using power4test()
is required to use this advanced
function.
If this advanced function is used,
users need to specify test_fun
as when using power4test(), and
need to set test_args correctly
See power4test() and
n_region_from_power() for full
details on how these functions
work.
## Not run: # An arbitrary mediation model model <- " m1 ~ x m21 ~ m1 m22 ~ m1 y ~ m21 + m22 " pop_es <- " m1 ~ x: m m21 ~ m1: m m22 ~ m1: m y ~ m21: m y ~ m22: m " # NOTE: In real power analysis: # - Set R to an appropriate value. # - Remove nrep or set nrep to the desired value. # - Remove parallel or set it to TRUE to enable parallel processing. # - Remove progress or set it to TRUE to see the progress. outa1 <- q_power_mediation( model = model, pop_es = pop_es, n = 100, R = 199, test_fun = test_k_indirect_effects, test_more_args = list(x = "x", y = "y", omnibus = "all"), seed = 1234, mode = "region", nrep = 20, parallel = FALSE, progress = FALSE ) outa1 summary(outa1) plot(outa1) ## End(Not run) # Simple mediation model # NOTE: In real power analysis: # - Set R to an appropriate value. # - Remove nrep or set nrep to the desired value. # - Remove parallel or set it to TRUE to enable parallel processing. # - Remove progress or set it to TRUE to see the progress. out <- q_power_mediation_simple( a = "m", b = "m", cp = "n", n = 50, R = 79, seed = 1234, nrep = 10, parallel = FALSE, progress = FALSE ) out # If mode = "region" is added, can call the following # summary(out) # plot(out) ## Not run: # Serial mediation model # NOTE: In real power analysis: # - Set R to an appropriate value. # - Remove nrep or set nrep to the desired value. # - Remove parallel or set it to TRUE to enable parallel processing. # - Remove progress or set it to TRUE to see the progress. outs <- q_power_mediation_serial( ab = c("s", "m", "l"), ab_others = "n", cp = "s", n = 50, R = 199, seed = 1234, mode = "region", nrep = 20, parallel = FALSE, progress = FALSE ) outs summary(outs) plot(outs) ## End(Not run) ## Not run: # Parallel mediation model # NOTE: In real power analysis: # - Set R to an appropriate value. # - Remove nrep or set nrep to the desired value. # - Remove parallel or set it to TRUE to enable parallel processing. # - Remove progress or set it to TRUE to see the progress. outp <- q_power_mediation_parallel( as = c("s", "m"), bs = c("m", "s"), cp = "n", n = 100, R = 199, seed = 1234, mode = "region", nrep = 20, parallel = FALSE, progress = FALSE ) outp summary(outp) plot(outp) ## End(Not run)## Not run: # An arbitrary mediation model model <- " m1 ~ x m21 ~ m1 m22 ~ m1 y ~ m21 + m22 " pop_es <- " m1 ~ x: m m21 ~ m1: m m22 ~ m1: m y ~ m21: m y ~ m22: m " # NOTE: In real power analysis: # - Set R to an appropriate value. # - Remove nrep or set nrep to the desired value. # - Remove parallel or set it to TRUE to enable parallel processing. # - Remove progress or set it to TRUE to see the progress. outa1 <- q_power_mediation( model = model, pop_es = pop_es, n = 100, R = 199, test_fun = test_k_indirect_effects, test_more_args = list(x = "x", y = "y", omnibus = "all"), seed = 1234, mode = "region", nrep = 20, parallel = FALSE, progress = FALSE ) outa1 summary(outa1) plot(outa1) ## End(Not run) # Simple mediation model # NOTE: In real power analysis: # - Set R to an appropriate value. # - Remove nrep or set nrep to the desired value. # - Remove parallel or set it to TRUE to enable parallel processing. # - Remove progress or set it to TRUE to see the progress. out <- q_power_mediation_simple( a = "m", b = "m", cp = "n", n = 50, R = 79, seed = 1234, nrep = 10, parallel = FALSE, progress = FALSE ) out # If mode = "region" is added, can call the following # summary(out) # plot(out) ## Not run: # Serial mediation model # NOTE: In real power analysis: # - Set R to an appropriate value. # - Remove nrep or set nrep to the desired value. # - Remove parallel or set it to TRUE to enable parallel processing. # - Remove progress or set it to TRUE to see the progress. outs <- q_power_mediation_serial( ab = c("s", "m", "l"), ab_others = "n", cp = "s", n = 50, R = 199, seed = 1234, mode = "region", nrep = 20, parallel = FALSE, progress = FALSE ) outs summary(outs) plot(outs) ## End(Not run) ## Not run: # Parallel mediation model # NOTE: In real power analysis: # - Set R to an appropriate value. # - Remove nrep or set nrep to the desired value. # - Remove parallel or set it to TRUE to enable parallel processing. # - Remove progress or set it to TRUE to see the progress. outp <- q_power_mediation_parallel( as = c("s", "m"), bs = c("m", "s"), cp = "n", n = 100, R = 199, seed = 1234, mode = "region", nrep = 20, parallel = FALSE, progress = FALSE ) outp summary(outp) plot(outp) ## End(Not run)
Generate random numbers from a beta distribution, rescaled to have user-specified population mean and standard deviation.
rbeta_rs(n = 10, shape1 = 0.5, shape2 = 0.5, pmean = 0, psd = 1)rbeta_rs(n = 10, shape1 = 0.5, shape2 = 0.5, pmean = 0, psd = 1)
n |
The number of random numbers to generate. |
shape1 |
|
shape2 |
|
pmean |
Population mean. |
psd |
Population standard deviation. |
First, specify the two parameters,
shape1 and shape2, and the
desired population mean and standard
deviation. The random numbers, drawn
from a beta distribution by
stats::rbeta() will then be
rescaled with the desired population
mean and standard deviation.
A vector of the generated random numbers.
set.seed(90870962) x <- rbeta_rs(n = 5000, shape1 = .5, shape2 = .5, pmean = 3, psd = 1) mean(x) sd(x) hist(x)set.seed(90870962) x <- rbeta_rs(n = 5000, shape1 = .5, shape2 = .5, pmean = 3, psd = 1) mean(x) sd(x) hist(x)
Generate random numbers from a beta distribution, rescaled to have user-specified population mean and standard deviation, and within a specific range.
rbeta_rs2(n = 10, bmean, bsd, blow = 0, bhigh = 1)rbeta_rs2(n = 10, bmean, bsd, blow = 0, bhigh = 1)
n |
The number of random numbers to generate. |
bmean |
The population mean. |
bsd |
The population standard
deviation. If |
blow |
The lower bound of the target range. |
bhigh |
The upper bound of the target range. |
First, specify the two parameters,
shape1 and shape2, and the
desired population mean and standard
deviation. The random numbers, drawn
from a beta distribution by
stats::rbeta() will then be
rescaled to the desired population range.
A vector of the generated random numbers.
set.seed(90870962) x <- rbeta_rs2(n = 5000, bmean = .80, bsd = .10, blow = .00, bhigh = .95) mean(x) sd(x) hist(x) y <- rbeta_rs2(n = 5000, bmean = 4, bsd = 3, blow = -10, bhigh = 10) mean(y) sd(y) hist(y)set.seed(90870962) x <- rbeta_rs2(n = 5000, bmean = .80, bsd = .10, blow = .00, bhigh = .95) mean(x) sd(x) hist(x) y <- rbeta_rs2(n = 5000, bmean = 4, bsd = 3, blow = -10, bhigh = 10) mean(y) sd(y) hist(y)
Generate random numbers from a distribution of 0 or 1, rescaled to have user-specified population mean and standard deviation.
rbinary_rs(n = 10, p1 = 0.5, pmean = 0, psd = 1)rbinary_rs(n = 10, p1 = 0.5, pmean = 0, psd = 1)
n |
The number of random numbers to generate. |
p1 |
The probability of being 1, before rescaling. |
pmean |
Population mean. |
psd |
Population standard deviation. |
First, specify probability of 1
(p1), and the desired population
mean and standard deviation. The
random numbers, drawn from a
distribution of 0 (1 - p1
probability) and 1 (p1
probability), will then be rescaled
with the desired population mean and
standard.
A vector of the generated random numbers.
set.seed(90870962) x <- rbinary_rs(n = 5000, p1 = .5, pmean = 3, psd = 1) mean(x) sd(x) hist(x)set.seed(90870962) x <- rbinary_rs(n = 5000, p1 = .5, pmean = 3, psd = 1) mean(x) sd(x) hist(x)
Get all rejection rates
of all tests stored in a power4test
object or other supported objects.
rejection_rates(object, ...) ## Default S3 method: rejection_rates(object, ...) ## S3 method for class 'power4test' rejection_rates( object, all_columns = FALSE, ci = TRUE, level = 0.95, se = FALSE, collapse = NULL, at_least_k = NULL, merge_all_tests = NULL, p_adjust_method = NULL, alpha = NULL, keep_nrep = FALSE, ... ) ## S3 method for class 'power4test_by_es' rejection_rates( object, all_columns = FALSE, ci = TRUE, level = 0.95, se = FALSE, nrep_if_diff = TRUE, ... ) ## S3 method for class 'power4test_by_n' rejection_rates( object, all_columns = FALSE, ci = TRUE, level = 0.95, se = FALSE, nrep_if_diff = TRUE, ... ) ## S3 method for class 'x_from_power' rejection_rates(object, ...) ## S3 method for class 'n_region_from_power' rejection_rates(object, ...) ## S3 method for class 'q_power_mediation' rejection_rates(object, ...) ## S3 method for class 'rejection_rates_df' print(x, digits = 3, annotation = TRUE, abbreviate_col_names = TRUE, ...)rejection_rates(object, ...) ## Default S3 method: rejection_rates(object, ...) ## S3 method for class 'power4test' rejection_rates( object, all_columns = FALSE, ci = TRUE, level = 0.95, se = FALSE, collapse = NULL, at_least_k = NULL, merge_all_tests = NULL, p_adjust_method = NULL, alpha = NULL, keep_nrep = FALSE, ... ) ## S3 method for class 'power4test_by_es' rejection_rates( object, all_columns = FALSE, ci = TRUE, level = 0.95, se = FALSE, nrep_if_diff = TRUE, ... ) ## S3 method for class 'power4test_by_n' rejection_rates( object, all_columns = FALSE, ci = TRUE, level = 0.95, se = FALSE, nrep_if_diff = TRUE, ... ) ## S3 method for class 'x_from_power' rejection_rates(object, ...) ## S3 method for class 'n_region_from_power' rejection_rates(object, ...) ## S3 method for class 'q_power_mediation' rejection_rates(object, ...) ## S3 method for class 'rejection_rates_df' print(x, digits = 3, annotation = TRUE, abbreviate_col_names = TRUE, ...)
object |
The object
from which the rejection rates
are to be computed, such as
a |
... |
Optional arguments. For
the |
all_columns |
If |
ci |
If |
level |
The level of confidence
for the confidence intervals, if
|
se |
If |
collapse |
Whether a single
decision (significant vs. not significant)
is made across all tests for a test
that consists of several tests
(e.g., the tests of several parameters).
If |
at_least_k |
Used by |
merge_all_tests |
If |
p_adjust_method |
The method to be
passed to |
alpha |
The level of significance
to use when using |
keep_nrep |
If |
nrep_if_diff |
If |
x |
The |
digits |
The number of digits to be printed after the decimal. |
annotation |
Logical. Whether additional notes will be printed. |
abbreviate_col_names |
Logical. Whether some column names will be abbreviated. |
For a power4test object,
rejection_rates loops over the tests stored
in a power4test object and retrieves
the rejection rate of each test.
The rejection_rates method for
power4test_by_es objects
is used to compute the rejection
rates from a power4test_by_es
object, with effect sizes added to
the output.
The rejection_rates method for
power4test_by_n objects
is used to compute the rejection
rates, with sample sizes added to
the output.
The rejection_rates method for
x_from_power objects
is used to compute the rejection
rates for stored trials. It supports
the output of x_from_power() and
its wrappers, such as n_from_power().
The rejection_rates method for
n_region_from_power objects
is used to compute the rejection
rates for stored trials. It supports
the output of n_region_from_power().
It is sufficient to retrieve the
trials in searching for the upper bound
because they also include the trials
used in searching for the lower bound.
The rejection_rates method for
q_power_mediation objects
is used to compute the rejection
rates for stored trials.
The rejection_rates method returns
a rejection_rates_df object,
with a print method.
If the input (object) is a
power4test object, the
rejection_rates_df object is
a data-frame like object with the
number of
rows equal to the number of tests.
Note that some tests, such as
the test by test_parameters(),
conduct one test for each parameter.
Each such test is counted as one
test.
The data frame has at least these columns:
test: The name of the test.
label: The label for each
test, or "Test" if a test only
does one test (e.g., test_indirect_effect()).
pvalid: The proportion of valid
tests across all replications.
reject: The rejection rate for
each test. If the null hypothesis
is false, then this is the power.
The rejection_rates method
for power4test_by_es objects
returns an object of the
class rejection_rates_df_by_es,
which is a subclass of
rejection_rates_df.
It is a data frame which is
similar to the output of
rejection_rates(), with two
columns added for the effect size (pop_es_name and
pop_es_values)
for each test.
The rejection_rates method
for power4test_by_n objects
returns an object of the
class rejection_rates_df_by_n,
which is a subclass of
rejection_rates_df.
It is a data frame which is
similar to the output of
a power4test object, with a
column n added for the sample size
for each test.
The rejection_rates method
for x_from_power objects
retrieves the stored
power4test_by_n or
power4test_by_es object,
and then
runs rejection_rates on it
and returns the result.
The rejection_rates method
for n_region_from_power objects
retrieves the stored
power4test_by_n object from
the above element (the search
for the region with power significantly
above the target power)
and then
runs rejection_rates on it
and returns the result.
The rejection_rates method
for q_power_mediation objects
retrieves the trials from
and then
runs rejection_rates on them
and returns the result. If mode
is "n", then the stored output
from n_from_power() is used.
If mode is "region", then
the stored output from n_region_from_power()
is used. If mode is "power", then
the stored output from power4test()
is used.
The print method of a
rejection_rates_df object returns
the object invisibly. It is called
for its side-effect.
Wilson, E. B. (1927). Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association, 22(158), 209-212. doi:10.1080/01621459.1927.10502953
power4test(),
power4test_by_n(), and
power4test_by_es(), which are
supported by this method.
# Specify the population model model_simple_med <- " m ~ x y ~ m + x " # Specify the effect sizes (population parameter values) model_simple_med_es <- " y ~ m: l m ~ x: m y ~ x: n " # Generate some datasets to check the model sim_only <- power4test(nrep = 4, model = model_simple_med, pop_es = model_simple_med_es, n = 100, R = 50, ci_type = "boot", fit_model_args = list(fit_function = "lm"), do_the_test = FALSE, iseed = 1234) # Do the test 'test_indirect_effect' on each datasets test_out <- power4test(object = sim_only, test_fun = test_indirect_effect, test_args = list(x = "x", m = "m", y = "y", boot_ci = TRUE, mc_ci = FALSE)) # Do the test 'test_parameters' on each datasets # and add the results to 'test_out' test_out <- power4test(object = test_out, test_fun = test_parameters) # Compute and print the rejection rates for stored tests rejection_rates(test_out) # See the help pages of power4test_by_n() and power4test_by_es() # for other examples.# Specify the population model model_simple_med <- " m ~ x y ~ m + x " # Specify the effect sizes (population parameter values) model_simple_med_es <- " y ~ m: l m ~ x: m y ~ x: n " # Generate some datasets to check the model sim_only <- power4test(nrep = 4, model = model_simple_med, pop_es = model_simple_med_es, n = 100, R = 50, ci_type = "boot", fit_model_args = list(fit_function = "lm"), do_the_test = FALSE, iseed = 1234) # Do the test 'test_indirect_effect' on each datasets test_out <- power4test(object = sim_only, test_fun = test_indirect_effect, test_args = list(x = "x", m = "m", y = "y", boot_ci = TRUE, mc_ci = FALSE)) # Do the test 'test_parameters' on each datasets # and add the results to 'test_out' test_out <- power4test(object = test_out, test_fun = test_parameters) # Compute and print the rejection rates for stored tests rejection_rates(test_out) # See the help pages of power4test_by_n() and power4test_by_es() # for other examples.
Generate random numbers from an exponential distribution, rescaled to have user-specified population mean and standard deviation.
rexp_rs(n = 10, rate = 1, pmean = 0, psd = 1, rev = FALSE)rexp_rs(n = 10, rate = 1, pmean = 0, psd = 1, rev = FALSE)
n |
The number of random numbers to generate. |
rate |
|
pmean |
Population mean. |
psd |
Population standard deviation. |
rev |
If TRUE, the distribution is revered to generate a negatively skewed distribution. Default is FALSE. |
First, specify the parameter,
rate, and the
desired population mean and standard
deviation. The random numbers, drawn
from an exponential distribution by
stats::rexp(), will then be
rescaled with the desired population
mean and standard.
A vector of the generated random numbers.
set.seed(90870962) x <- rexp_rs(n = 5000, rate = 4, pmean = 3, psd = 1) mean(x) sd(x) hist(x)set.seed(90870962) x <- rexp_rs(n = 5000, rate = 4, pmean = 3, psd = 1) mean(x) sd(x) hist(x)
Generate random numbers from a lognormal distribution, rescaled to have user-specified population mean and standard deviation.
rlnorm_rs(n = 10, mui = 0, sigma = 1, rev = FALSE, pmean = 0, psd = 1)rlnorm_rs(n = 10, mui = 0, sigma = 1, rev = FALSE, pmean = 0, psd = 1)
n |
The number of random numbers to generate. |
mui |
The parameter |
sigma |
The parameter |
rev |
If TRUE, the distribution is revered to generate a negatively skewed distribution. Default is FALSE. |
pmean |
Population mean. |
psd |
Population standard deviation. |
First, specify the parameter,
mui and sigma, and the
desired population mean and standard
deviation. The random numbers, drawn
from a lognormal distribution by
stats::rlnorm(), will then be
rescaled with the desired population
mean and standard.
A vector of the generated random numbers.
set.seed(90870962) x <- rlnorm_rs(n = 5000, mui = 0, sigma = 1, pmean = 0, psd = 1) mean(x) sd(x) hist(x)set.seed(90870962) x <- rlnorm_rs(n = 5000, mui = 0, sigma = 1, pmean = 0, psd = 1) mean(x) sd(x) hist(x)
Generate random numbers from generalized normal distribution, rescaled to have user-specified population mean and standard deviation.
rpgnorm_rs(n = 10, p = 2, pmean = 0, psd = 1)rpgnorm_rs(n = 10, p = 2, pmean = 0, psd = 1)
n |
The number of random numbers to generate. |
p |
The parameter of the distribution. Must be a positive non-zero number. Default is 2, resulting in a normal distribution. Higher than 2 results in negative excess kurtosis. Lower than 2 results in positive excess kurtosis. |
pmean |
Population mean. |
psd |
Population standard deviation. |
First, specify the parameter p
and the desired population
mean and standard deviation. The
random numbers, drawn from a
generalized normal distribution by
pgnorm::rpgnorm(), will then be
rescaled with the desired population
mean and standard.
A vector of the generated random numbers.
set.seed(90870962) x <- rpgnorm_rs(n = 5000, p = 2, pmean = 0, psd = 1) mean(x) sd(x) hist(x) x_kurt <- function(p) {gamma(5/p)*gamma(1/p)/(gamma(3/p)^2) - 3} p <- 6 x <- rpgnorm_rs(n = 50000, p = p, pmean = 0, psd = 1) mean(x) sd(x) x_kurt(p) qqnorm(x); qqline(x) p <- 1 x <- rpgnorm_rs(n = 50000, p = p, pmean = 0, psd = 1) mean(x) sd(x) x_kurt(p) qqnorm(x); qqline(x)set.seed(90870962) x <- rpgnorm_rs(n = 5000, p = 2, pmean = 0, psd = 1) mean(x) sd(x) hist(x) x_kurt <- function(p) {gamma(5/p)*gamma(1/p)/(gamma(3/p)^2) - 3} p <- 6 x <- rpgnorm_rs(n = 50000, p = p, pmean = 0, psd = 1) mean(x) sd(x) x_kurt(p) qqnorm(x); qqline(x) p <- 1 x <- rpgnorm_rs(n = 50000, p = p, pmean = 0, psd = 1) mean(x) sd(x) x_kurt(p) qqnorm(x); qqline(x)
Generate random numbers from a t distribution, rescaled to have user-specified population mean and standard deviation.
rt_rs(n = 10, df = 5, pmean = 0, psd = 1)rt_rs(n = 10, df = 5, pmean = 0, psd = 1)
n |
The number of random numbers to generate. |
df |
|
pmean |
Population mean. |
psd |
Population standard deviation. |
First, specify the parameter df
and the desired population
mean and standard deviation. The
random numbers, drawn from the
generalized normal distribution by
stats::rt(), will then be
rescaled with the desired population
mean and standard.
A vector of the generated random numbers.
set.seed(90870962) x <- rt_rs(n = 5000, df = 5, pmean = 3, psd = 1) mean(x) sd(x) hist(x)set.seed(90870962) x <- rt_rs(n = 5000, df = 5, pmean = 3, psd = 1) mean(x) sd(x) hist(x)
Generate random numbers from a uniform distribution, with user-specified population mean and standard deviation.
runif_rs(n = 10, min = 0, max = 1, pmean = 0, psd = 1)runif_rs(n = 10, min = 0, max = 1, pmean = 0, psd = 1)
n |
The number of random numbers to generate. |
min |
min for runif. |
max |
max for runif. |
pmean |
Population mean. |
psd |
Population standard deviation. |
First, the user specifies the parameters, min and max, and the desired population mean and standard deviation. Then the random numbers will be generated and rescaled with the desired population mean and standard.
A vector of the generated random numbers.
set.seed(90870962) x <- runif_rs(n = 5000, min = 2, max = 4, pmean = 3, psd = 1) mean(x) sd(x) hist(x)set.seed(90870962) x <- runif_rs(n = 5000, min = 2, max = 4, pmean = 3, psd = 1) mean(x) sd(x) hist(x)
For the process_data
argument. To compute scale scores
from indicators and replace the
indicators scores by computed
scale scores.
scale_scores(data, method = c("mean", "sum"))scale_scores(data, method = c("mean", "sum"))
data |
A data frame with the
indicator scores. It must has an
attribute |
method |
The method to be used
to compute the scale scores. Can
be |
This function is to be used in
the process_data argument of
power4test().
It retrieves the attribute
"number_of_indicators",
stored by power4test(), to identify
factors with indicators, computes
the scale scores based on method,
and replace the indicators by the
scale scores.
All subsequent steps, such as the test functions, will see only the scale scores, or original scores if a variable has no indicator. The model will also be fitted on the scale scores, not on the indicators.
It can be used to estimate power for analyzing the scale scores, taking into account the measurement error due to imperfect reliability.
It returns a data frame with the scale scores computed.
# Specify the model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " y ~ m: l m ~ x: m y ~ x: n " # Specify the numbers of indicators and reliability coefficients k <- c(y = 3, m = 4, x = 5) rel <- c(y = .70, m = .70, x = .70) # Simulate the data out <- power4test( nrep = 2, model = mod, pop_es = mod_es, n = 200, number_of_indicators = k, reliability = rel, process_data = list(fun = "scale_scores"), test_fun = test_parameters, test_args = list(op = "~"), parallel = FALSE, iseed = 1234) dat <- pool_sim_data(out) head(dat)# Specify the model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " y ~ m: l m ~ x: m y ~ x: n " # Specify the numbers of indicators and reliability coefficients k <- c(y = 3, m = 4, x = 5) rel <- c(y = .70, m = .70, x = .70) # Simulate the data out <- power4test( nrep = 2, model = mod, pop_es = mod_es, n = 200, number_of_indicators = k, reliability = rel, process_data = list(fun = "scale_scores"), test_fun = test_parameters, test_args = list(op = "~"), parallel = FALSE, iseed = 1234) dat <- pool_sim_data(out) head(dat)
Get a model matrix and effect size specification and simulate a number of datasets, along with other information.
The function
sim_data( nrep = 10, ptable = NULL, model = NULL, pop_es = NULL, ..., n = 100, iseed = NULL, number_of_indicators = NULL, reliability = NULL, loading_difference = NULL, reference = NULL, x_fun = list(), e_fun = list(), process_data = NULL, parallel = FALSE, progress = FALSE, ncores = max(1, parallel::detectCores(logical = FALSE) - 1), n_ratio = 1, cl = NULL ) ## S3 method for class 'sim_data' print( x, digits = 3, digits_descriptive = 2, data_long = TRUE, fit_to_all_args = list(), est_type = "standardized", variances = NULL, pure_x = TRUE, pure_y = TRUE, ... ) pool_sim_data(object, as_list = FALSE)sim_data( nrep = 10, ptable = NULL, model = NULL, pop_es = NULL, ..., n = 100, iseed = NULL, number_of_indicators = NULL, reliability = NULL, loading_difference = NULL, reference = NULL, x_fun = list(), e_fun = list(), process_data = NULL, parallel = FALSE, progress = FALSE, ncores = max(1, parallel::detectCores(logical = FALSE) - 1), n_ratio = 1, cl = NULL ) ## S3 method for class 'sim_data' print( x, digits = 3, digits_descriptive = 2, data_long = TRUE, fit_to_all_args = list(), est_type = "standardized", variances = NULL, pure_x = TRUE, pure_y = TRUE, ... ) pool_sim_data(object, as_list = FALSE)
nrep |
The number of replications to generate the simulated datasets. Default is 10. |
ptable |
The output of
|
model |
The |
pop_es |
The character to
specify population effect sizes.
See ptable_pop on
how to specify this argument.
Ignored if |
... |
For sim_data, parameters
to be passed to |
n |
The sample size for each dataset. Default is 100. |
iseed |
The seed for the random
number generator. Default is |
number_of_indicators |
A named
vector to specify the number of
indicators for each factors. See
the help page on how to set this
argument. Default is |
reliability |
A named vector
(for a single-group model) or a
named list of named vectors
(for a multigroup model)
to set the reliability coefficient
of each set of indicators. Default
is |
loading_difference |
A named vector
(for a single-group model) or a
named list of named vectors
(for a multigroup model)
to set the difference in factor
loadings between neighboring indicators
of each set of indicators. Default
is |
reference |
A named vector
(for a single-group model) or a
named list of named vectors
(for a multigroup model)
to indicate which indicator will
be the first indicator (and so
is the reference indicator, by default).
Default
is |
x_fun |
The function(s) used to
generate the exogenous variables or
error terms. If
not supplied, or set to |
e_fun |
The function(s) used to
generate the error terms of indicators,
if any. If
not supplied, or set to |
process_data |
If not |
parallel |
If |
progress |
If |
ncores |
The number of CPU cores to use if parallel processing is used. |
n_ratio |
If the model is a
multigroup model, and |
cl |
A cluster, such as one created
by |
x |
The |
digits |
The numbers of digits displayed after the decimal. |
digits_descriptive |
The number of digits displayed after the decimal for the descriptive statistics table. |
data_long |
If |
fit_to_all_args |
A named list
of arguments to be passed to
|
est_type |
The type of estimates
to be printed. Can be a character
vector of one to two elements. If
only |
variances |
Logical. Whether
variances and error variances are printed.
Default depends on |
pure_x, pure_y
|
When Logical. When printing indirect effects, whether only "pure" x-variables (variables not predicted by another other variables) and/or "pure" y-variables (variables that do not predict any other variables other than indicators) will be included in enumerating the paths. |
object |
Either a |
as_list |
Logical. If |
The function sim_data() generates
a list of datasets based on a population
model.
The function sim_out() returns
a list of the class sim_data,
with length nrep. Each element
is a sim_data_i object, with
the following major elements:
ptable: A lavaan parameter
table of the model, with population
values set in the column start.
(It is the output of the
function ptable_pop().)
mm_out: The population model
represented by model matrices
as in lavaan. (It is the output
of the function
model_matrices_pop().)
mm_lm_out: A list of regression
model formula, one for each
endogenous variable. (It is the
output of the internal function
mm_lm().)
mm_lm_dat_out: A simulated dataset
generated from the population model.
(It is the output of the internal
function mm_lm_data()).
model_original: The original model
syntax (i.e., the argument model).
model_final: A modified model
syntax if the model is a latent
variable model. Indicators are added
to the syntax.
fit0: The output of lavaan::sem()
with ptable as the model and
do.fit set to FALSE. Used for the
easy retrieval of information
about the model.
The print method of sim_data
returns x invisibly. It is called for
its side effect.
The function pool_sim_data() returns
either one data frame or a list
of data frames, depending on the
argument as_list
sim_data()
The function sim_data() is used by
the all-in-one function
power4test(). Users usually do not
call this function directly, though
developers can use this function to
develop other functions for power
analysis, or to build their own
workflows to do the power analysis.
The function sim_data() does two tasks:
Determine the actual population model with population values based on:
A model syntax for the observed variables (for a path model) or the latent factors (for a latent variable model).
A textual specification of the effect sizes of parameters.
The number of indicators for each latent factor if the model is a latent variable model.
The reliability of each latent factor as measured by the indicators if the model is a latent factor model.
Generate nrep simulated datasets from the population model.
The simulated datasets can then be used to fit a model, test parameters, and estimate power.
The output is usually used by
fit_model() to fit a target model,
by default the population model, to each
of the dataset.
The arguments number_of_indicators
and reliability are used to
specify the number of indicators
(e.g., items) for each factor,
and the population reliability
coefficient of each factor,
if the variables in the model
syntax are latent variables.
Optionally, loading_difference can
be used to generate indicators with
unequal standardized factor loadings,
and reference can be used to specify
the indicator with the medium,
strongest, or weakest standardized
factor loading as the first indicator,
which is used as the reference indicator
in lavaan.
If a variable in the model is to be
replaced by indicators in the generated
data, set
number_of_indicators to a named
numeric vector. The names are the
variables of variables with
indicators, as appeared in the
model syntax. The value of each
name is the number of indicators.
The
argument reliability should then be
set a named numeric vector (or list,
see the section on multigroup models)
to specify the population reliability
coefficient ("omega") of each set of
indicators. The population standardized factor
loadings are then computed to ensure
that the population reliability
coefficient is of the target value.
These are examples for a single group model:
number of indicator = c(m = 3, x = 4, y = 5)
The numbers of indicators for m,
x, and y are 3, 4, and 5,
respectively.
reliability = c(m = .90, x = .80, y = .70)
The population reliability
coefficients of m, x, and y are
.90, .80, and .70, respectively.
Multigroup models are supported.
The number of groups is inferred
from pop_es (see the help page
of ptable_pop()), or directly
from ptable.
For a multigroup model, the number of indicators for each variable must be the same across groups.
However, the population reliability
coefficients can be different
across groups. For a multigroup model
of k groups,
with one or more population reliability
coefficients differ across groups,
the argument reliability should be
set to a named list. The names are
the variables to which the population
reliability coefficients are to be
set. The element for each name is
either a single value for the common
reliability coefficient, or a
numeric vector of the reliability
coefficient of each group.
This is an example of reliability
for a model with 2 groups:
reliability = list(x = .80, m = c(.70, .80))
The reliability coefficients of x are
.80 in all groups, while the
reliability coefficients of m are
.70 in one group and .80 in another.
If all variables in the model has
the same number of indicators,
number_of_indicators can be set
to one single value.
Similarly, if all sets of indicators
have the same population reliability
in all groups, reliability can also
be set to one single value.
By default, variables and error
terms are generated
from a multivariate normal distribution.
If desired, users can supply the
function used to generate an exogenous
variable and error term by setting x_fun to
a named list.
The names of the list are the variables for which a user function will be used to generate the data.
Each element of the list must also be a list. The first element of this list, can be unnamed, is the function to be used. If other arguments need to be supplied, they should be included as named elements of this list.
For example:
x_fun = list(x = list(power4mome::rexp_rs),
w = list(power4mome::rbinary_rs,
p1 = .70)))
The variables x and w will be
generated by user-supplied functions.
For x, the function is
power4mome::rexp_rs. No additional
argument when calling this function.
For w, the function is
power4mome::rbinary_rx. The argument
p1 = .70 will be passed to this
function when generating the values
of w.
If a variable is an endogenous
variable (e.g., being predicted by
another variable in a model), then
x_fun is used to generate its
error term. Its implied population
distribution may still be different
from that generate by x_fun because
the distribution also depends on the
distribution of other variables
predicting it.
These are requirements for the user-functions:
They must return a numeric vector.
They mush has an argument n for
the number of values.
The population mean and standard deviation of the generated values must be 0 and 1, respectively.
The package power4mome has
helper functions for generating
values from some common nonnormal
distributions and then scaling them
to have population mean and standard
deviation equal to 0 and 1 (by default), respectively.
These are some of them:
To use x_fun, the variables must
have zero covariances with other
variables in the population. It is
possible to generate nonnormal
multivariate data but we believe this
is rarely needed when estimating
power before having the data.
For a single-group model, model
should be a lavaan model syntax
string of the form of the model.
The population values of the model
parameters are to be determined by
pop_es.
If the model has latent factors,
the syntax in model should specify
only the structural model for the
latent factors. There is no
need to specify the measurement
part. Other functions will generate
the measurement part on top of this
model.
For example, this is a simple mediation model:
"m ~ x y ~ m + x"
Whether m, x, and y denote
observed variables or latent factors
are determined by other functions,
such as power4test().
Because the model is the population
model, equality constraints are
irrelevant and the model syntax
specifies only the form of the
model. Therefore, model is
specified as in the case of single
group models.
The argument pop_es is for specifying
the population values of model
parameters. This section describes
how to do this using named vectors.
If pop_es is specified by a named
vector, it must follow the convention
below.
The names of the vectors are
lavaan names for the selected
parameters. For example, m ~ x
denotes the path from x to m.
Alternatively, the names can be
either ".beta." or ".cov.".
Use ".beta." to set the default
values for all regression coefficients.
Use ".cov." to set the default
values for all correlations of
exogenous variables (e.g., predictors).
The names can also be of this form:
".ind.(<path>)", whether <path>
denote path in the model. For
example, ".ind.(x->m->y)" denotes
the path from x through m to
y. Alternatively, the lavaan
symbol ~ can also be used:
".ind.(y~m~x)". This form is used
to set the indirect effect (standardized,
by default) along this path. The
value for this name will override
other settings.
If using lavaan names, can
specify more than one parameter
using +. For example, y ~ m + x
denotes the two paths from m and
x to y.
The value of each element can be
the label for the effect size: n
for nil, s for small, m for
medium, and l for large. The
value for each label is determined
by es1 and es2. See the section
on specifying these two arguments.
The value of pop_es can also be
set to a value, but it must be
quoted as a string, such as "y ~ x" = ".31".
This is an example:
c(".beta." = "s",
"m1 ~ x" = "-m",
"m2 ~ m1" = "l",
"y ~ x:w" = "s")
In this example,
All regression coefficients are
set to small (s) by default, unless
specified otherwise.
The path from x to m1 is
set to medium and negative (-m).
The path from m1 to m2 is set
to large (l).
The coefficient of the product
term x:w when predicting y is
set to small (s).
When setting an indirect effect to
a symbol (default: "si", "mi",
"li", with "i" added to differentiate
them from the labels for a direct path),
the corresponding value is used to
determine the population values of
all component paths along the pathway.
All the values are assumed to be equal.
Therefore, ".ind.(x->m->y)" = ".20"
is equivalent to setting m ~ x
and y ~ m to the square root of
.20, such that the corresponding
indirect effect is equal to the
designated value.
This behavior, though restricted, is for quick manipulation of the indirect effect. If different values along a pathway, set the value for each path directly.
Only nonnegative value is supported.
Therefore, ".ind.(x->m->y)" = "-si"
and ".ind.(x->m->y)" = "-.20" will
throw an error.
The argument pop_es also supports multigroup
models.
For pop_es, instead of
named vectors, named list of
named vectors should be used.
The names are the parameters, or
keywords such as .beta. and
.cov., like specifying the
population values for a single
group model.
The elements are character vectors. If it has only one element (e.g., a single string), then it is the the population value for all groups. If it has more than one element (e.g., a vector of three strings), then they are the population values of the groups. For a model of k groups, each vector must have either k elements or one element.
This is an example:
list("m ~ x" = "m",
"y ~ m" = c("s", "m", "l"))
In this model, the population value
of the path m ~ x is medium (m) for
all groups, while the population
values for the path y ~ m are
small (s), medium (m), and large (l),
respectively.
When setting the argument pop_es,
instead of using a named vector
or named list for
pop_es, the population values of
model parameters can also be
specified using a multiline string,
as illustrated below, to be parsed
by pop_es_yaml().
This is an example of the multiline string for a single-group model:
y ~ m: l m ~ x: m y ~ x: nil
The string must follow this format:
Each line starts with tag:.
tag can be the name of a
parameter, in lavaan model
syntax format.
For example, m ~ x
denotes the path from x to m.
A tag in lavaan model syntax can
specify more than one parameter
using +.
For example, y ~ m + x
denotes the two paths from m and
x to y.
Alternatively, the tag can be
either .beta. or .cov..
Use .beta. to set the default
values for all regression coefficients.
Use .cov. to set the default
values for all correlations of
exogenous variables (e.g., predictors).
After each tag is the value of the population value:
-nil for nil (zero),
s for small,
m for medium, and
l for large.
si, mi, and li for
small, medium, and large a
standardized indirect effect,
respectively.
Note: n cannot be used in this mode.
The
value for each label is determined
by es1 and es2 as described
in ptable_pop().
The value can also be
set to a numeric value, such as
.30 or -.30.
This is another example:
.beta.: s y ~ m: l
In this example, all regression
coefficients are small, while
the path from m to y is large.
This is an example of the string for a multigroup model:
y ~ m: l m ~ x: - nil - s y ~ x: nil
The format is similar to that for
a single-group model. If a parameter
has the same value for all groups,
then the line can be specified
as in the case of a single-group
model: tag: value.
If a parameter has different values across groups, then it must be in this format:
A line starts with the tag, followed
by two or more lines. Each line
starts with a hyphen - and the
value for a group.
For example:
m ~ x: - nil - s
This denotes that the model has
two groups. The values of the path
from x to m for the two
groups are 0 (nil) and
small (s), respectively.
Another equivalent way to specify
the values are using [], on
the same line of a tag.
For example:
m ~ x: [nil, s]
The number of groups is inferred from the number of values for a parameter. Therefore, if a tag has more than one value, each tag must has the same number of value, or only one value.
The tag .beta. and .cov. can
also be used for multigroup models.
Note that using named vectors or named lists is more reliable. However, using a multiline string is more user-friendly. If this method failed, please use named vectors or named list instead.
The multiline string is parsed by yaml::read_yaml().
Therefore, the format requirement
is actually that of YAML. Users
knowledgeable of YAML can use other
equivalent way to specify the string.
The vector es1 is for correlations,
regression coefficients, and
indirect effect, and the
vector es2 is for for standardized
moderation effect, the coefficients
of a product term. These labels
are to be used in interpreting
the specification in pop_es.
# Specify the model mod <- "m ~ x y ~ m + x" # Specify the population values es <- " y ~ m: m m ~ x: m y ~ x: n " # Generate the simulated datasets data_all <- sim_data(nrep = 5, model = mod, pop_es = es, n = 100, iseed = 1234) data_all# Specify the model mod <- "m ~ x y ~ m + x" # Specify the population values es <- " y ~ m: m m ~ x: m y ~ x: n " # Generate the simulated datasets data_all <- sim_data(nrep = 5, model = mod, pop_es = es, n = 100, iseed = 1234) data_all
Combine the outputs
of sim_data(), fit_model(),
and optionally gen_mc() and/or
gen_boot() to one
single object.
sim_out(data_all, ...) ## S3 method for class 'sim_out' print(x, digits = 3, digits_descriptive = 2, fit_to_all_args = list(), ...)sim_out(data_all, ...) ## S3 method for class 'sim_out' print(x, digits = 3, digits_descriptive = 2, fit_to_all_args = list(), ...)
data_all |
The output of
|
... |
Named arguments of
objects to be added to each
replication under the element
|
x |
The |
digits |
The numbers of digits displayed after the decimal. |
digits_descriptive |
The number of digits displayed after the decimal for the descriptive statistics table. |
fit_to_all_args |
A named list
of arguments to be passed to
|
It merges into one object the output
of sim_data(), which is a list of
nrep simulated datasets,
fit_model(), which is a list of the
lavaan::sem() output for the nrep
datasets, and optionally the output
of gen_mc() or gen_boot(), which is a list of the
R sets of Monte Carlo or bootstrap estimates
based on the results of
fit_model(). The list has nrep
elements, each element with the data,
the model fit
results, and optionally the Monte
Carlo estimates matched.
This object can then be used for testing effects of interests, which are further processed to estimate the power of this test.
The function sim_out() is used by
the all-in-one function
power4test(). Users usually do not
call this function directly, though
developers can use this function to
develop other functions for power
analysis, or to build their own
workflows to do the power analysis.
The function sim_out() returns a sim_out object, which
is a list of length equal to the
length of data_all. Each element
of the list is a sim_data object
with the element extra added to
it. Other named elements will be
added under this name. For example.
the output of fit_model()
for this replication can be added
to fit, under extra. See
the description of the argument
... for details.
The print method of sim_out
returns x invisibly. Called for
its side effect.
# Specify the model mod <- "m ~ x y ~ m + x" # Specify the population values es <- " y ~ m: m m ~ x: m y ~ x: n " # Generate the simulated datasets dats <- sim_data(nrep = 5, model = mod, pop_es = es, n = 100, iseed = 1234) # Fit the population model to each dataset fits <- fit_model(dats) # Combine the results to one object sim_out_all <- sim_out(data_all = dats, fit = fits) sim_out_all # Verify that the elements of fits are set to extra$fit library(lavaan) parameterEstimates(fits[[1]]) parameterEstimates(sim_out_all[[1]]$extra$fit) parameterEstimates(fits[[2]]) parameterEstimates(sim_out_all[[2]]$extra$fit)# Specify the model mod <- "m ~ x y ~ m + x" # Specify the population values es <- " y ~ m: m m ~ x: m y ~ x: n " # Generate the simulated datasets dats <- sim_data(nrep = 5, model = mod, pop_es = es, n = 100, iseed = 1234) # Fit the population model to each dataset fits <- fit_model(dats) # Combine the results to one object sim_out_all <- sim_out(data_all = dats, fit = fits) sim_out_all # Verify that the elements of fits are set to extra$fit library(lavaan) parameterEstimates(fits[[1]]) parameterEstimates(sim_out_all[[1]]$extra$fit) parameterEstimates(fits[[2]]) parameterEstimates(sim_out_all[[2]]$extra$fit)
Extract and summarize test results.
summarize_tests( object, collapse = c("none", "all_sig", "at_least_one_sig", "at_least_k_sig"), at_least_k = 1, merge_all_tests = FALSE, p_adjust_method = "none", alpha = 0.05 ) ## S3 method for class 'test_summary_list' print(x, digits = 3, ...) ## S3 method for class 'test_summary' print(x, digits = 2, ...) ## S3 method for class 'test_out_list' print(x, digits = 3, test_long = TRUE, ...)summarize_tests( object, collapse = c("none", "all_sig", "at_least_one_sig", "at_least_k_sig"), at_least_k = 1, merge_all_tests = FALSE, p_adjust_method = "none", alpha = 0.05 ) ## S3 method for class 'test_summary_list' print(x, digits = 3, ...) ## S3 method for class 'test_summary' print(x, digits = 2, ...) ## S3 method for class 'test_out_list' print(x, digits = 3, test_long = TRUE, ...)
object |
A |
collapse |
Whether a single
decision (significant vs. not significant)
is made across all tests for a test
that consists of several tests
(e.g., the tests of several parameters).
If |
at_least_k |
Used by |
merge_all_tests |
If |
p_adjust_method |
The method to be
passed to |
alpha |
The level of significance
to use when using |
x |
The object to be printed. |
digits |
The numbers of digits after the decimal when printing numeric results. |
... |
Optional arguments. Not used. |
test_long |
If |
The function summarize_tests()
is used to extract
information from each test stored
in a power4test object.
The method print.test_out_list() is
used to print the content of a list
of test stored in a power4test
object, with the option to print
just the names of tests.
The function summarize_tests() returns
a list of the class test_summary_list.
Each element contains a summary of a
test stored. The elements are of
the class test_summary, with
these elements:
test_attributes: The stored
information of a test, for printing.
nrep: The number of datasets
(replications).
mean: The means of numeric
information. For significance
tests, these are the rejection
rates.
nvalid: The number of non-NA
replications used to compute each
mean.
The print methods returns x invisibly.
They are called for their side
effects.
summarize_tests() and related functionsThe function summarize_tests() and
related print methods are used by
the all-in-one function
power4test() and its summary
method. Users usually do not
call them directly, though
developers can use this function to
develop other functions for power
analysis, or to build their own
workflows to do the power analysis.
# Specify the model mod <- " m ~ x y ~ m + x " # Specify the population values es <- " y ~ m: l m ~ x: m y ~ x: n " # Simulated datasets sim_only <- power4test(nrep = 2, model = mod, pop_es = es, n = 100, do_the_test = FALSE, iseed = 1234) # Test the parameters in each dataset test_out <- power4test(object = sim_only, test_fun = test_parameters) # Print the summary summarize_tests(test_out)# Specify the model mod <- " m ~ x y ~ m + x " # Specify the population values es <- " y ~ m: l m ~ x: m y ~ x: n " # Simulated datasets sim_only <- power4test(nrep = 2, model = mod, pop_es = es, n = 100, do_the_test = FALSE, iseed = 1234) # Test the parameters in each dataset test_out <- power4test(object = sim_only, test_fun = test_parameters) # Print the summary summarize_tests(test_out)
The summary method of
the output of x_from_power().
## S3 method for class 'x_from_power' summary(object, ...) ## S3 method for class 'n_region_from_power' summary(object, ...) ## S3 method for class 'summary.x_from_power' print(x, digits = 3, ...) ## S3 method for class 'summary.n_region_from_power' print(x, digits = 3, ...)## S3 method for class 'x_from_power' summary(object, ...) ## S3 method for class 'n_region_from_power' summary(object, ...) ## S3 method for class 'summary.x_from_power' print(x, digits = 3, ...) ## S3 method for class 'summary.n_region_from_power' print(x, digits = 3, ...)
object |
An |
... |
Additional arguments. Not used for now. |
x |
The output of
|
digits |
The number of digits after the decimal when printing the results. |
The summary method simply prepares the
results of x_from_power()
to be printed in details.
The summary method for
x_from_power objects returns an
object of the class
summary.x_from_power, which is
simply the output of x_from_power(),
with a print method dedicated for
detailed summary. Please refer
to x_from_power() for the contents.
The print-method of summary.x_from_power
objects returns the object x
invisibly.
It is called for its side effect.
The print-method of summary.n_region_from_power
objects returns the object x
invisibly.
It is called for its side effect.
x_from_power(),
n_region_from_power()
# Specify the population model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " m ~ x: m y ~ m: l y ~ x: n " # Generate the datasets sim_only <- power4test(nrep = 5, model = mod, pop_es = mod_es, n = 100, do_the_test = FALSE, iseed = 2345) # Do a test test_out <- power4test(object = sim_only, test_fun = test_parameters, test_args = list(pars = "m~x")) # Determine the sample size with a power of .80 (default) power_vs_n <- x_from_power(test_out, x = "n", progress = TRUE, target_power = .80, final_nrep = 5, max_trials = 1, seed = 1234) summary(power_vs_n)# Specify the population model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " m ~ x: m y ~ m: l y ~ x: n " # Generate the datasets sim_only <- power4test(nrep = 5, model = mod, pop_es = mod_es, n = 100, do_the_test = FALSE, iseed = 2345) # Do a test test_out <- power4test(object = sim_only, test_fun = test_parameters, test_args = list(pars = "m~x")) # Determine the sample size with a power of .80 (default) power_vs_n <- x_from_power(test_out, x = "n", progress = TRUE, target_power = .80, final_nrep = 5, max_trials = 1, seed = 1234) summary(power_vs_n)
Test a conditional
indirect effect
for a power4test object.
test_cond_indirect( fit = fit, x = NULL, m = NULL, y = NULL, wvalues = NULL, mc_ci = TRUE, mc_out = NULL, boot_ci = FALSE, boot_out = NULL, check_post_check = TRUE, test_method = NULL, ..., fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )test_cond_indirect( fit = fit, x = NULL, m = NULL, y = NULL, wvalues = NULL, mc_ci = TRUE, mc_out = NULL, boot_ci = FALSE, boot_out = NULL, check_post_check = TRUE, test_method = NULL, ..., fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )
fit |
The fit object, to be
passed to |
x |
The name of the |
m |
A character vector of the
name(s) of mediator(s). The path
moves from the first mediator in the
vector to the last mediator in the
vector. Can be |
y |
The name of the |
wvalues |
A numeric vector of
named elements. The names are the
variable names of the moderators,
and the values are the values to
which the moderators will be set to.
Default is |
mc_ci |
Logical. If |
mc_out |
The pre-generated
Monte Carlo estimates generated by
manymome::do_mc, stored in
a |
boot_ci |
Logical. If |
boot_out |
The pre-generated
bootstrap estimates generated by
manymome::do_boot, stored in
a |
check_post_check |
Logical. If
|
test_method |
The method to do
the test. If |
... |
Additional arguments to
be passed to |
fit_name |
The name of the
model fit object to be extracted.
Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
This function is to be used in
power4test() for testing a
conditional
indirect effect, by setting it
to the test_fun argument.
It uses manymome::cond_indirect()
to do the test. It can be used on
models fitted by lavaan::sem()
or fitted by a sequence of calls
to stats::lm(), although only
nonparametric bootstrap confidence
interval is supported for models
fitted by regression using
stats::lm().
It can also be used to test
a conditional effect on a direct path
with no mediator. Just omit m when
calling the function.
In its normal usage, it returns a named numeric vector with the following elements:
est: The mean of the estimated
indirect effect across datasets.
cilo and cihi: The means of the
lower and upper limits of the
confidence interval (95% by
default), respectively.
sig: Whether a test by confidence
interval is significant (1) or
not significant (0).
# Specify the model model_simple_mod <- " m ~ x + w + x:w y ~ m + x " # Specify the population values model_simple_mod_es <- " y ~ m: l y ~ x: n m ~ x: m m ~ w: n m ~ x:w: l " # Simulate the data sim_only <- power4test(nrep = 5, model = model_simple_mod, pop_es = model_simple_mod_es, n = 100, R = 100, do_the_test = FALSE, iseed = 1234) # Do the test in each replication test_ind <- power4test(object = sim_only, test_fun = test_cond_indirect, test_args = list(x = "x", m = "m", y = "y", wvalues = c(w = 1), mc_ci = TRUE)) print(test_ind, test_long = TRUE)# Specify the model model_simple_mod <- " m ~ x + w + x:w y ~ m + x " # Specify the population values model_simple_mod_es <- " y ~ m: l y ~ x: n m ~ x: m m ~ w: n m ~ x:w: l " # Simulate the data sim_only <- power4test(nrep = 5, model = model_simple_mod, pop_es = model_simple_mod_es, n = 100, R = 100, do_the_test = FALSE, iseed = 1234) # Do the test in each replication test_ind <- power4test(object = sim_only, test_fun = test_cond_indirect, test_args = list(x = "x", m = "m", y = "y", wvalues = c(w = 1), mc_ci = TRUE)) print(test_ind, test_long = TRUE)
Test several conditional
indirect effects
for a power4test object.
test_cond_indirect_effects( fit = fit, x = NULL, m = NULL, y = NULL, wlevels = NULL, mc_ci = TRUE, mc_out = NULL, boot_ci = FALSE, boot_out = NULL, check_post_check = TRUE, test_method = NULL, compare_groups = FALSE, ..., fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )test_cond_indirect_effects( fit = fit, x = NULL, m = NULL, y = NULL, wlevels = NULL, mc_ci = TRUE, mc_out = NULL, boot_ci = FALSE, boot_out = NULL, check_post_check = TRUE, test_method = NULL, compare_groups = FALSE, ..., fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )
fit |
The fit object, to be
passed to |
x |
The name of the |
m |
A character vector of the
name(s) of mediator(s). The path
moves from the first mediator in the
vector to the last mediator in the
vector. Can be |
y |
The name of the |
wlevels |
The output of
|
mc_ci |
Logical. If |
mc_out |
The pre-generated
Monte Carlo estimates generated by
manymome::do_mc, stored in
a |
boot_ci |
Logical. If |
boot_out |
The pre-generated
bootstrap estimates generated by
manymome::do_boot, stored in
a |
check_post_check |
Logical. If
|
test_method |
The method to do
the test. If |
compare_groups |
If the model is a multigroup model, compute and test group differences for all pairwise combinations of the groups. Ignored if the model is a single-group model. |
... |
Additional arguments to
be passed to |
fit_name |
The name of the
model fit object to be extracted.
Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
This function is to be used in
power4test() for testing several
conditional
indirect effects, by setting it
to the test_fun argument.
It uses manymome::cond_indirect_effects()
to do the test. It can be used on
models fitted by lavaan::sem()
or fitted by a sequence of calls
to stats::lm(), although only
nonparametric bootstrap confidence
interval is supported for models
fitted by regression using
stats::lm().
It can also be used to test
conditional effects on a direct path
with no mediator. Just omit m when
calling the function.
In its normal usage, it returns
the output returned by
manymome::cond_indirect_effects(),
with the following modifications:
est: The estimated
conditional indirect effects.
cilo and cihi: The
lower and upper limits of the
confidence interval (95% by
default), respectively.
sig: Whether a test by confidence
interval is significant (1) or
not significant (0).
test_label: A column of labels
generated to label the conditional
effects.
# Specify the model model_simple_mod <- " m ~ x + w + x:w y ~ m + x " # Specify the population values model_simple_mod_es <- " y ~ m: l y ~ x: n m ~ x: m m ~ w: n m ~ x:w: l " # Simulate the data # Set nrep to a larger value in real analysis, such as 400 sim_only <- power4test(nrep = 5, model = model_simple_mod, pop_es = model_simple_mod_es, n = 100, R = 100, do_the_test = FALSE, iseed = 1234) # Do the tests in each replication test_out <- power4test(object = sim_only, test_fun = test_cond_indirect_effects, test_args = list(x = "x", m = "m", y = "y", wlevels = c("w"), mc_ci = TRUE)) print(test_out, test_long = TRUE)# Specify the model model_simple_mod <- " m ~ x + w + x:w y ~ m + x " # Specify the population values model_simple_mod_es <- " y ~ m: l y ~ x: n m ~ x: m m ~ w: n m ~ x:w: l " # Simulate the data # Set nrep to a larger value in real analysis, such as 400 sim_only <- power4test(nrep = 5, model = model_simple_mod, pop_es = model_simple_mod_es, n = 100, R = 100, do_the_test = FALSE, iseed = 1234) # Do the tests in each replication test_out <- power4test(object = sim_only, test_fun = test_cond_indirect_effects, test_args = list(x = "x", m = "m", y = "y", wlevels = c("w"), mc_ci = TRUE)) print(test_out, test_long = TRUE)
Test the model fit change when one or more between-group constraints are imposed.
test_group_equal( fit = fit, group.equal = NULL, group.partial = NULL, check_post_check = TRUE, ..., fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )test_group_equal( fit = fit, group.equal = NULL, group.partial = NULL, check_post_check = TRUE, ..., fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )
fit |
The fit object. Must be
the output of |
group.equal |
The same argument
used by |
group.partial |
The same argument
used by |
check_post_check |
Logical. If
|
... |
Optional arguments to be
passed to |
fit_name |
The name of the
model fit object to be extracted.
Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
This function is to be used in
power4test() for testing
the difference in model fit
when one or more between-group
constraints are imposed
, by
setting it to the test_fun
argument.
In its normal usage, it returns a one-row data frame with the following columns:
est: The chi-square difference.
cilo and cihi: NA. Not used.
sig: Whether the chi-square
difference test is significant
test_label: The constraints
imposted.
# Specify the model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " y ~ m: l m ~ x: - nil - s y ~ x: nil " # Simulate the data sim_only <- power4test(nrep = 2, model = mod, pop_es = mod_es, n = 100, iseed = 1234) # Do the tests in each replication test_out <- power4test(object = sim_only, test_fun = test_group_equal, test_args = list(group.equal = "regressions")) print(test_out, test_long = TRUE)# Specify the model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " y ~ m: l m ~ x: - nil - s y ~ x: nil " # Simulate the data sim_only <- power4test(nrep = 2, model = mod, pop_es = mod_es, n = 100, iseed = 1234) # Do the tests in each replication test_out <- power4test(object = sim_only, test_fun = test_group_equal, test_args = list(group.equal = "regressions")) print(test_out, test_long = TRUE)
Test a moderated
mediation effect for a power4test
object.
test_index_of_mome( fit = fit, x = NULL, m = NULL, y = NULL, w = NULL, mc_ci = TRUE, mc_out = NULL, boot_ci = FALSE, boot_out = NULL, check_post_check = TRUE, test_method = NULL, ..., fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )test_index_of_mome( fit = fit, x = NULL, m = NULL, y = NULL, w = NULL, mc_ci = TRUE, mc_out = NULL, boot_ci = FALSE, boot_out = NULL, check_post_check = TRUE, test_method = NULL, ..., fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )
fit |
The fit object, to be
passed to |
x |
The name of the |
m |
A character vector of the
name(s) of mediator(s). The path
moves from the first mediator in the
vector to the last mediator in the
vector. Can be |
y |
The name of the |
w |
The name of the moderator. |
mc_ci |
Logical. If |
mc_out |
The pre-generated
Monte Carlo estimates generated by
manymome::do_mc, stored in
a |
boot_ci |
Logical. If |
boot_out |
The pre-generated
bootstrap estimates generated by
manymome::do_boot, stored in
a |
check_post_check |
Logical. If
|
test_method |
The method to do
the test. If |
... |
Additional arguments to
be passed to |
fit_name |
The name of the
model fit object to be extracted.
Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
This function is to be used in
power4test() for testing a
moderated mediation effect, by
setting it to the test_fun
argument.
It uses manymome::index_of_mome()
to do the test. It can be used on
models fitted by lavaan::sem()
or fitted by a sequence of calls
to stats::lm(), although only
nonparametric bootstrap confidence
interval is supported for models
fitted by regression using
stats::lm().
In its normal usage, it returns a named numeric vector with the following elements:
est: The mean of the estimated
indirect effect across datasets.
cilo and cihi: The means of the
lower and upper limits of the
confidence interval (95% by
default), respectively.
sig: Whether a test by confidence
interval is significant (1) or
not significant (0).
# Specify the model mod <- " m ~ x + w + x:w y ~ m " # Specify the population values mod_es <- " m ~ x: n y ~ x: m m ~ w: l m ~ x:w: l " # Simulate the data sim_only <- power4test(nrep = 2, model = mod, pop_es = mod_es, n = 100, R = 100, do_the_test = FALSE, iseed = 1234) # Do the test in each replication test_out <- power4test(object = sim_only, test_fun = test_index_of_mome, test_args = list(x = "x", m = "m", y = "y", w = "w", mc_ci = TRUE)) print(test_out, test_long = TRUE)# Specify the model mod <- " m ~ x + w + x:w y ~ m " # Specify the population values mod_es <- " m ~ x: n y ~ x: m m ~ w: l m ~ x:w: l " # Simulate the data sim_only <- power4test(nrep = 2, model = mod, pop_es = mod_es, n = 100, R = 100, do_the_test = FALSE, iseed = 1234) # Do the test in each replication test_out <- power4test(object = sim_only, test_fun = test_index_of_mome, test_args = list(x = "x", m = "m", y = "y", w = "w", mc_ci = TRUE)) print(test_out, test_long = TRUE)
Test an indirect effect
for a power4test object.
test_indirect_effect( fit = fit, x = NULL, m = NULL, y = NULL, mc_ci = TRUE, mc_out = NULL, boot_ci = FALSE, boot_out = NULL, check_post_check = TRUE, test_method = NULL, ..., fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )test_indirect_effect( fit = fit, x = NULL, m = NULL, y = NULL, mc_ci = TRUE, mc_out = NULL, boot_ci = FALSE, boot_out = NULL, check_post_check = TRUE, test_method = NULL, ..., fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )
fit |
The fit object, to be
passed to |
x |
The name of the |
m |
A character vector of the
name(s) of mediator(s). The path
moves from the first mediator in the
vector to the last mediator in the
vector. Can be |
y |
The name of the |
mc_ci |
Logical. If |
mc_out |
The pre-generated
Monte Carlo estimates generated by
manymome::do_mc, stored in
a |
boot_ci |
Logical. If |
boot_out |
The pre-generated
bootstrap estimates generated by
manymome::do_boot, stored in
a |
check_post_check |
Logical. If
|
test_method |
The method to do
the test. If |
... |
Additional arguments to
be passed to |
fit_name |
The name of the
model fit object to be extracted.
Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
This function is to be used in
power4test() for testing an
indirect effect, by setting it
to the test_fun argument.
It uses manymome::indirect_effect()
to do the test. It can be used on
models fitted by lavaan::sem()
or fitted by a sequence of calls
to stats::lm(), although only
nonparametric bootstrap confidence
interval is supported for models
fitted by regression using
stats::lm().
In its normal usage, it returns a named numeric vector with the following elements:
est: The mean of the estimated
indirect effect across datasets.
cilo and cihi: The means of the
lower and upper limits of the
confidence interval (95% by
default), respectively.
sig: Whether a test by confidence
interval is significant (1) or
not significant (0).
Asparouhov, A., & Muthén, B. (2021). Bootstrap p-value computation. Retrieved from https://www.statmodel.com/download/FAQ-Bootstrap%20-%20Pvalue.pdf
# Specify the model model_simple_med <- " m ~ x y ~ m + x " # Specify the population values model_simple_med_es <- " y ~ m: l m ~ x: m y ~ x: n " # Simulate the data sim_only <- power4test(nrep = 5, model = model_simple_med, pop_es = model_simple_med_es, n = 100, R = 100, do_the_test = FALSE, iseed = 1234) # Do the test in each replication test_ind <- power4test(object = sim_only, test_fun = test_indirect_effect, test_args = list(x = "x", m = "m", y = "y", mc_ci = TRUE)) print(test_ind, test_long = TRUE)# Specify the model model_simple_med <- " m ~ x y ~ m + x " # Specify the population values model_simple_med_es <- " y ~ m: l m ~ x: m y ~ x: n " # Simulate the data sim_only <- power4test(nrep = 5, model = model_simple_med, pop_es = model_simple_med_es, n = 100, R = 100, do_the_test = FALSE, iseed = 1234) # Do the test in each replication test_ind <- power4test(object = sim_only, test_fun = test_indirect_effect, test_args = list(x = "x", m = "m", y = "y", mc_ci = TRUE)) print(test_ind, test_long = TRUE)
Test several indirect effects
for a power4test object.
test_k_indirect_effects( fit = fit, x = NULL, m = NULL, y = NULL, mc_ci = TRUE, mc_out = NULL, boot_ci = FALSE, boot_out = NULL, check_post_check = TRUE, test_method = NULL, ..., omnibus = c("no", "all_sig", "at_least_one_sig", "at_least_k_sig", "total"), at_least_k = 1, p_adjust_method = "none", fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )test_k_indirect_effects( fit = fit, x = NULL, m = NULL, y = NULL, mc_ci = TRUE, mc_out = NULL, boot_ci = FALSE, boot_out = NULL, check_post_check = TRUE, test_method = NULL, ..., omnibus = c("no", "all_sig", "at_least_one_sig", "at_least_k_sig", "total"), at_least_k = 1, p_adjust_method = "none", fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )
fit |
The fit object, to be
passed to |
x |
The name of the |
m |
Must be a list of character
vectors. Each character vector stores the
name(s) of mediator(s) along a path.
The path
moves from the first mediator in the
vector to the last mediator in the
vector. If |
y |
The name of the |
mc_ci |
Logical. If |
mc_out |
The pre-generated
Monte Carlo estimates generated by
manymome::do_mc, stored in
a |
boot_ci |
Logical. If |
boot_out |
The pre-generated
bootstrap estimates generated by
manymome::do_boot, stored in
a |
check_post_check |
Logical. If
|
test_method |
The method to do
the test. If |
... |
Additional arguments to
be passed to |
omnibus |
If |
at_least_k |
The minimum number
of paths required to be significant
for the omnibus test to be considered
significant. Used when
|
p_adjust_method |
The method to be
passed to |
fit_name |
The name of the
model fit object to be extracted.
Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
This function is to be used in
power4test() for testing an
indirect effect, by setting it
to the test_fun argument.
It uses manymome::many_indirect_effects()
to do the test. It can be used on
models fitted by lavaan::sem()
or fitted by a sequence of calls
to stats::lm(), although only
nonparametric bootstrap confidence
interval is supported for models
fitted by regression using
stats::lm().
It also supports testing the total
indirect effect. Just set omnibus
to "total".
In its normal usage, it returns a data frame with the following columns:
est: The estimated
indirect effect for each path.
cilo and cihi: The
lower and upper limits of the
confidence interval (95% by
default), respectively,
for each indirect effect
sig: Whether a test by confidence
interval is significant (1) or
not significant (0).
test_label: A column of labels
generated to label the indirect
effects.
If omnibus is "all_sig" or
"at_least_one"sig", then
the data frame has only one row,
and the columns "est", "cilo",
and "cihi" are NA. The column
sig is determined by whether
all paths are significant ("all_sig")
or whether at least one path is
significant ("at_least_one_sig").
# Specify the model model_simple_med <- " m1 ~ x m2 ~ x y ~ m1 + m2 + x " # Specify the population values model_simple_med_es <- " y ~ m1: s m1 ~ x: m y ~ m2: s m2 ~ x: l y ~ x: n " # Simulate the data sim_only <- power4test(nrep = 5, model = model_simple_med, pop_es = model_simple_med_es, n = 100, R = 100, do_the_test = FALSE, iseed = 1234) # Do the test in each replication test_ind <- power4test(object = sim_only, test_fun = test_k_indirect_effects, test_args = list(x = "x", y = "y", mc_ci = TRUE)) print(test_ind, test_long = TRUE) # Set omnibus = "all_sig" to declare # significant only if all paths are # significant test_ind_all_sig <- power4test( object = sim_only, test_fun = test_k_indirect_effects, test_args = list(x = "x", y = "y", mc_ci = TRUE, omnibus = "all_sig")) print(test_ind_all_sig, test_long = TRUE)# Specify the model model_simple_med <- " m1 ~ x m2 ~ x y ~ m1 + m2 + x " # Specify the population values model_simple_med_es <- " y ~ m1: s m1 ~ x: m y ~ m2: s m2 ~ x: l y ~ x: n " # Simulate the data sim_only <- power4test(nrep = 5, model = model_simple_med, pop_es = model_simple_med_es, n = 100, R = 100, do_the_test = FALSE, iseed = 1234) # Do the test in each replication test_ind <- power4test(object = sim_only, test_fun = test_k_indirect_effects, test_args = list(x = "x", y = "y", mc_ci = TRUE)) print(test_ind, test_long = TRUE) # Set omnibus = "all_sig" to declare # significant only if all paths are # significant test_ind_all_sig <- power4test( object = sim_only, test_fun = test_k_indirect_effects, test_args = list(x = "x", y = "y", mc_ci = TRUE, omnibus = "all_sig")) print(test_ind_all_sig, test_long = TRUE)
Test all moderation
effects by testing all product
terms for a power4test object.
test_moderation( fit = fit, standardized = FALSE, check_post_check = TRUE, ..., p_adjust_method = "none", fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )test_moderation( fit = fit, standardized = FALSE, check_post_check = TRUE, ..., p_adjust_method = "none", fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE )
fit |
The fit object, to be
passed to |
standardized |
Logical. If |
check_post_check |
Logical. If
|
... |
Additional arguments to
be passed to |
p_adjust_method |
The method to be
passed to |
fit_name |
The name of the fit
results for which the parameter names
will be displayed. Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
This function is to be used in
power4test() for testing all
product terms, by
setting it to the test_fun
argument.
It is just a wrapper to
test_parameters(). It will
first identifies all product
terms (terms with : in the names),
and then call test_parameters(),
with pars set to select the
regression coefficients of these
terms.
In its normal usage, it returns
the output returned by
lavaan::parameterEstimates()
or lmhelprs::lm_list_to_partable(),
with the following modifications:
est: The parameter estimates,
even if standardized estimates
are requested (not est.std).
cilo and cihi: The
lower and upper limits of the
confidence interval (95% by
default), respectively (not
ci.lower and ci.upper).
sig: Whether a test by confidence
interval is significant (1) or
not significant (0).
test_label: A column of labels
generated by
lavaan::lav_partable_labels(),
which are usually the labels used by
coef() to label the parameters.
power4test(),
test_parameters()
# Specify the model mod <- " m ~ x + w1 + x:w1 y ~ m + x " # Specify the population values mod_es <- " m ~ x: n y ~ x: m m ~ w1: n m ~ x:w1: l " # Simulate the data sim_only <- power4test(nrep = 4, model = mod, pop_es = mod_es, n = 100, do_the_test = FALSE, iseed = 1234) # Do the test in each replication test_out <- power4test(object = sim_only, test_fun = test_moderation) print(test_out, test_long = TRUE)# Specify the model mod <- " m ~ x + w1 + x:w1 y ~ m + x " # Specify the population values mod_es <- " m ~ x: n y ~ x: m m ~ w1: n m ~ x:w1: l " # Simulate the data sim_only <- power4test(nrep = 4, model = mod, pop_es = mod_es, n = 100, do_the_test = FALSE, iseed = 1234) # Do the test in each replication test_out <- power4test(object = sim_only, test_fun = test_moderation) print(test_out, test_long = TRUE)
Test all free parameters,
including user-defined parameters,
for a power4test object.
test_parameters( fit = fit, standardized = FALSE, pars = NULL, op = NULL, remove.nonfree = TRUE, check_post_check = TRUE, exclude_var = FALSE, compare_groups = FALSE, ..., omnibus = c("no", "all_sig", "at_least_one_sig", "at_least_k_sig"), at_least_k = 1, p_adjust_method = "none", fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE ) find_par_names(object, fit_name = "fit")test_parameters( fit = fit, standardized = FALSE, pars = NULL, op = NULL, remove.nonfree = TRUE, check_post_check = TRUE, exclude_var = FALSE, compare_groups = FALSE, ..., omnibus = c("no", "all_sig", "at_least_one_sig", "at_least_k_sig"), at_least_k = 1, p_adjust_method = "none", fit_name = "fit", get_map_names = FALSE, get_test_name = FALSE ) find_par_names(object, fit_name = "fit")
fit |
The fit object, to be
passed to |
standardized |
Logical. If |
pars |
Optional. If set to
a character vector, only parameters
with |
op |
Optional. If set to a
character vector, only parameters with
operators (e.g., |
remove.nonfree |
Logical. If
|
check_post_check |
Logical. If
|
exclude_var |
Logical. If |
compare_groups |
Logical. If |
... |
Additional arguments to
be passed to |
omnibus |
If |
at_least_k |
The minimum number
of paths required to be significant
for the omnibus test to be considered
significant. Used when
|
p_adjust_method |
The method to be
passed to |
fit_name |
The name of the fit
results for which the parameter names
will be displayed. Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
object |
A |
This function is to be used in
power4test() for testing all
free and user-defined model
parameters, by
setting it to the test_fun
argument.
For models fitted by lavaan,
it uses lavaan::parameterEstimates()
to do the test. If bootstrapping was
requested (by setting se = "boot"),
then it supports bootstrap
confidence intervals returned by
lavaan::parameterEstimates().
It has preliminary, though limited,
supported for models fitted by
stats::lm() (through
lmhelprs::many_lm()). Tests are
conducted by ordinary least squares
confidence intervals based on
the t statistic, reported by
stats::confint() applied to
the output of stats::lm().
In its normal usage, it returns
the output returned by
lavaan::parameterEstimates()
or lmhelprs::lm_list_to_partable(),
with the following modifications:
est: The parameter estimates,
even if standardized estimates
are requested (not est.std).
cilo and cihi: The
lower and upper limits of the
confidence interval (95% by
default), respectively (not
ci.lower and ci.upper).
sig: Whether a test by confidence
interval is significant (1) or
not significant (0).
test_label: A column of labels
generated by
lavaan::lav_partable_labels(),
which are usually the labels used by
coef() to label the parameters.
To use the argument pars, the
names as appeared in the function
coef() must be used. For the
output of lavaan, this can
usually be inferred from the
parameter syntax (e.g., y~x,
no space). If not sure, call
coef() on the output of lavaan.
If a parameter is labelled, then
the label should be used in par.
If not sure, the function
find_par_names() can be used to
find valid names.
# Specify the model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " y ~ m: l m ~ x: m y ~ x: n " # Simulate the data sim_only <- power4test(nrep = 2, model = mod, pop_es = mod_es, n = 100, do_the_test = FALSE, iseed = 1234) # Do the tests in each replication test_out <- power4test(object = sim_only, test_fun = test_parameters) print(test_out, test_long = TRUE) # Do the tests in each replication: Standardized solution # Delta method SEs will be used to do the tests test_out <- power4test(object = sim_only, test_fun = test_parameters, test_args = list(standardized = TRUE)) print(test_out, test_long = TRUE) # Do the tests in each replication: Parameters with the selected operator test_out <- power4test(object = sim_only, test_fun = test_parameters, test_args = list(op = "~")) print(test_out, test_long = TRUE) # Finding valid parameter names find_par_names(sim_only)# Specify the model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " y ~ m: l m ~ x: m y ~ x: n " # Simulate the data sim_only <- power4test(nrep = 2, model = mod, pop_es = mod_es, n = 100, do_the_test = FALSE, iseed = 1234) # Do the tests in each replication test_out <- power4test(object = sim_only, test_fun = test_parameters) print(test_out, test_long = TRUE) # Do the tests in each replication: Standardized solution # Delta method SEs will be used to do the tests test_out <- power4test(object = sim_only, test_fun = test_parameters, test_args = list(standardized = TRUE)) print(test_out, test_long = TRUE) # Do the tests in each replication: Parameters with the selected operator test_out <- power4test(object = sim_only, test_fun = test_parameters, test_args = list(op = "~")) print(test_out, test_long = TRUE) # Finding valid parameter names find_par_names(sim_only)
It searches by simulation the sample size (given other factors, such as effect sizes) or effect size (given other factors, such as sample size) with power to detect an effect close to a target value.
x_from_power( object, x = arg_x_from_power(object, "x", arg_in = "call") %||% "n", pop_es_name = arg_x_from_power(object, "pop_es_name", arg_in = "call"), target_power = 0.8, what = arg_x_from_power(object, "what") %||% "point", goal = arg_x_from_power(object, "goal") %||% { switch(what, point = "ci_hit", ub = "close_enough", lb = "close_enough") }, ci_level = 0.95, tolerance = NULL, x_interval = switch(x, n = c(50, 2000), es = NULL), extendInt = NULL, progress = TRUE, simulation_progress = NULL, max_trials = NULL, final_nrep = attr(object, "args")$nrep %||% (object$nrep_final %||% 400), final_R = attr(object, "args")$R %||% (object$args$final_R %||% 1000), seed = NULL, x_include_interval = FALSE, check_es_interval = TRUE, power_curve_args = list(power_model = NULL, start = NULL, lower_bound = NULL, upper_bound = NULL, nls_control = list(), nls_args = list()), save_sim_all = FALSE, algorithm = NULL, control = list(), internal_args = list(), rejection_rates_args = list(collapse = "all_sig", at_least_k = 1, p_adjust_method = "none", alpha = 0.05) ) n_from_power( object, pop_es_name = NULL, target_power = 0.8, what = formals(x_from_power)$what, goal = formals(x_from_power)$goal, ci_level = 0.95, tolerance = NULL, x_interval = c(50, 2000), extendInt = NULL, progress = TRUE, simulation_progress = NULL, max_trials = NULL, final_nrep = formals(x_from_power)$final_nrep, final_R = formals(x_from_power)$final_R, seed = NULL, x_include_interval = FALSE, check_es_interval = TRUE, power_curve_args = list(power_model = NULL, start = NULL, lower_bound = NULL, upper_bound = NULL, nls_control = list(), nls_args = list()), save_sim_all = FALSE, algorithm = NULL, control = list(), internal_args = list(), rejection_rates_args = list(collapse = "all_sig", at_least_k = 1, p_adjust_method = "none", alpha = 0.05) ) n_region_from_power( object, pop_es_name = NULL, target_power = 0.8, ci_level = 0.95, tolerance = NULL, x_interval = c(50, 2000), extendInt = NULL, progress = TRUE, simulation_progress = NULL, max_trials = NULL, final_nrep = formals(x_from_power)$final_nrep, final_R = formals(x_from_power)$final_R, seed = NULL, x_include_interval = FALSE, check_es_interval = TRUE, power_curve_args = list(power_model = NULL, start = NULL, lower_bound = NULL, upper_bound = NULL, nls_control = list(), nls_args = list()), save_sim_all = FALSE, algorithm = NULL, control = list(), internal_args = list(), rejection_rates_args = list(collapse = "all_sig", at_least_k = 1, p_adjust_method = "none", alpha = 0.05) ) ## S3 method for class 'x_from_power' print(x, digits = 3, call = TRUE, ...) ## S3 method for class 'n_region_from_power' print(x, digits = 3, ...) arg_x_from_power(object, arg, arg_in = NULL) pba_diagnosis( a_out, p_interval = c(0.05, 0.95), posterior_xlim = c(0.01, 0.99), which = NULL )x_from_power( object, x = arg_x_from_power(object, "x", arg_in = "call") %||% "n", pop_es_name = arg_x_from_power(object, "pop_es_name", arg_in = "call"), target_power = 0.8, what = arg_x_from_power(object, "what") %||% "point", goal = arg_x_from_power(object, "goal") %||% { switch(what, point = "ci_hit", ub = "close_enough", lb = "close_enough") }, ci_level = 0.95, tolerance = NULL, x_interval = switch(x, n = c(50, 2000), es = NULL), extendInt = NULL, progress = TRUE, simulation_progress = NULL, max_trials = NULL, final_nrep = attr(object, "args")$nrep %||% (object$nrep_final %||% 400), final_R = attr(object, "args")$R %||% (object$args$final_R %||% 1000), seed = NULL, x_include_interval = FALSE, check_es_interval = TRUE, power_curve_args = list(power_model = NULL, start = NULL, lower_bound = NULL, upper_bound = NULL, nls_control = list(), nls_args = list()), save_sim_all = FALSE, algorithm = NULL, control = list(), internal_args = list(), rejection_rates_args = list(collapse = "all_sig", at_least_k = 1, p_adjust_method = "none", alpha = 0.05) ) n_from_power( object, pop_es_name = NULL, target_power = 0.8, what = formals(x_from_power)$what, goal = formals(x_from_power)$goal, ci_level = 0.95, tolerance = NULL, x_interval = c(50, 2000), extendInt = NULL, progress = TRUE, simulation_progress = NULL, max_trials = NULL, final_nrep = formals(x_from_power)$final_nrep, final_R = formals(x_from_power)$final_R, seed = NULL, x_include_interval = FALSE, check_es_interval = TRUE, power_curve_args = list(power_model = NULL, start = NULL, lower_bound = NULL, upper_bound = NULL, nls_control = list(), nls_args = list()), save_sim_all = FALSE, algorithm = NULL, control = list(), internal_args = list(), rejection_rates_args = list(collapse = "all_sig", at_least_k = 1, p_adjust_method = "none", alpha = 0.05) ) n_region_from_power( object, pop_es_name = NULL, target_power = 0.8, ci_level = 0.95, tolerance = NULL, x_interval = c(50, 2000), extendInt = NULL, progress = TRUE, simulation_progress = NULL, max_trials = NULL, final_nrep = formals(x_from_power)$final_nrep, final_R = formals(x_from_power)$final_R, seed = NULL, x_include_interval = FALSE, check_es_interval = TRUE, power_curve_args = list(power_model = NULL, start = NULL, lower_bound = NULL, upper_bound = NULL, nls_control = list(), nls_args = list()), save_sim_all = FALSE, algorithm = NULL, control = list(), internal_args = list(), rejection_rates_args = list(collapse = "all_sig", at_least_k = 1, p_adjust_method = "none", alpha = 0.05) ) ## S3 method for class 'x_from_power' print(x, digits = 3, call = TRUE, ...) ## S3 method for class 'n_region_from_power' print(x, digits = 3, ...) arg_x_from_power(object, arg, arg_in = NULL) pba_diagnosis( a_out, p_interval = c(0.05, 0.95), posterior_xlim = c(0.01, 0.99), which = NULL )
object |
A |
x |
For |
pop_es_name |
The name of the
parameter. Required if |
target_power |
The target power, a value greater than 0 and less than one. |
what |
The value for which is
searched: the estimate power ( |
goal |
The goal of the search.
If |
ci_level |
The level of confidence of the confidence intervals computed for the estimated power. Default is .95, denoting 95%. |
tolerance |
Used when the goal
is |
x_interval |
A vector of
two values, the minimum value
and the maximum values of |
extendInt |
Whether |
progress |
Logical. Whether the searching progress is reported. |
simulation_progress |
Logical.
Whether the progress in each call
to |
max_trials |
The maximum number
of trials in searching the value
with the target power. Rounded
up if not an integer.
If |
final_nrep |
The number of
replications in the final stage,
also the maximum number of replications
in each call to |
final_R |
The number of
Monte Carlo simulation or
bootstrapping samples in the final
stage. The |
seed |
If not |
x_include_interval |
Logical.
Whether the minimum and maximum
values in |
check_es_interval |
If |
power_curve_args |
A named
list of arguments to be passed
|
save_sim_all |
If |
algorithm |
The algorithm for
finding |
control |
A named list of additional arguments to be passed to the algorithm to be used. For advanced users. |
internal_args |
A named list of internal arguments. For internal testing. Do not use it. |
rejection_rates_args |
Argument values to be used when
|
digits |
The number of digits after the decimal when printing the results. |
call |
Logical. Whether the call is printed. |
... |
Optional arguments. Not used for now. |
arg |
The name of element to retrieve. |
arg_in |
The name of the element from which an element is to be retrieved. |
a_out |
The output of
|
p_interval |
The range of the plot that "zooms" around the solution (or the median of in the final posterior probability distribution), expressed in terms of the area of the distribution. |
posterior_xlim |
The range of the first plot of search history and the plot of posterior probability distribution, expressed in terms of the area of the distribution. |
which |
If |
This is how to use x_from_power():
Specify the model by power4test(),
with do_the_test = FALSE, and set
the magnitude of the effect sizes
to the minimum levels to detect.
Add the test using power4test()
using test_fun and test_args
(see the help page of power4test()
for details). Run it on the
starting sample size or
effect size.
Call x_from_power() on the output
of power4test() returned from
the previous step. This
function will iteratively repeat
the analysis on either other sample
sizes, or other values for a
selected model parameter (the
effect sizes),
trying to achieve a goal (goal) for
a value of interest (what).
If the goal is "ci_hit", the
search will try to find a value (a sample
size, or a population value of
the selected model parameter) with
a power level close enough to the
target power, defined by having its
confidence interval for the power
including the target power.
If the goal is "close_enough",
then the search will try to find a
value of x with its level of
power ("point"), the upper bound
of the confidence interval for this
level of power ("ub"), or the
lower bound of the confidence interval
fro this level of power ("lb")
"close enough" to the target level of
power, defined by having an absolute
difference less than the tolerance.
If several values of x (sample
size or the population value of
a model parameter) have already been
examined by power4test_by_n() or
power4test_by_es(), the output
of these two functions can also be
used as object by x_from_power().
Usually, the default values of the arguments should be sufficient.
The results can be viewed using
summary(), and the output has
a plot method (plot.x_from_power()) to
plot the relation between power and
values (of x) examined.
A detailed illustration on how to use this function for sample size can be found from this page:
https://sfcheung.github.io/power4mome/articles/x_from_power_for_n.html
The function n_from_power() is just
a wrapper of x_from_power(), with
x set to "n".
The function n_region_from_power() is just
a wrapper of x_from_power(), with
x set to "n", with two passes, one
with what = "ub" and one with
what = "lb".
The print method only prints
basic information. Call the
summary method of x_from_power objects
(summary.x_from_power()) and its
print method for detailed results
The function arg_x_from_power()
is a helper to set argument values
if object is an output
of x_from_power() or similar
functions.
The function pba_diagnosis()
generates simple diagnostic plots
for the search history of probabilistic
bisection algorithm. This function is for advanced users
to examine the search history of
the probabilistic bisection algorithm.
It is for diagnostic purpose and has
limited support for customizing the plots.
The function x_from_power()
returns an x_from_power object,
which is a list with the following
elements:
power4test_trials: The output of
power4test_by_n() for all sample
sizes examined, or of
power4test_by_es() for all
population values of the selected
parameter examined.
rejection_rates: The output of
rejection_rates().
x_tried: The sample sizes or
population values
examined.
power_tried: The estimated
rejection rates for all the values
examined.
x_final: The sample size or
population value in the
solution. NA if a solution not found.
power_final: The estimated power
of the value in the solution.
NA if a solution not found.
i_final: The position of the
solution in power4test_trials.
NA if a solution not found.
ci_final: The confidence interval
of the estimated power in the solution.
The method is determined
by the option power4mome.ci_method.
If NULL or "wilson", Wilson's
(1927) method is used. If
"norm", normal approximation
is used.
ci_level: The level of confidence
of ci_final.
nrep_final: The number of
replications (nrep) when estimating
the power in the solution.
power_curve: The output of
power_curve() when estimating the
power curve.
target_power: The requested
target power.
power_tolerance: The allowed
difference between the solution's
estimated power and the target
power. Determined by the number
of replications and the level of
confidence of the confidence intervals.
x_estimated: The value
(sample size or population value)
with the target power, estimated by
power_curve. This is used, when
solution not found, to determine the
range of the values to search when
calling the function again.
start: The time and date when
the process started.
end: The time and date when the
process ended.
time_spent: The time spent in
doing the search.
args: A named list of the arguments
of x_from_power() used in the search.
call: The call when this function
is called.
The function n_region_from_power()
returns a named list of two output of
n_from_power(), of the class
n_region_from_power. The output
with what = "ub" is named "below",
and the output with what = "lb" is
namd "above".
The print-method of x_from_power
objects returns the object x
invisibly.
It is called for its side effect.
The print-method of x_from_power_region
objects returns the object x
invisibly.
It is called for its side effect.
The function arg_x_from_power()
returns the requested argument if
available. If not available, it
returns NULL.
The function pba_diagnosis()
returns NULL invisibly. Called for
its side-effect.
Three algorithms are currently available, the simple (though sometimes inefficient) bisection method, a method that makes use of the estimated crude power curve, and probabilistic bisection algorithm (Waeber et al., 2013; see Chalmers, 2024, for applying this algorithm to power analysis).
Unlike typical root-finding problems,
the prediction of the level of power
is stochastic. Moreover, the computational
cost is high when Monte Carlo or
bootstrap confidence intervals are
used to do a test because the estimation
of the power for one single value of
x can sometimes take one minute or
longer. Therefore, in addition to
the simple bisection method, two methods,
named power curve method
(which belongs to the family of
surrogate function approximation method
reviewed in Chalmers, 2024) and
probabilistic bisection method
(Chalmers, 2024; Waeber et al., 2013) were also
developed for this
scenario.
This method (called informat bisection
in Chalmers, 2024), enabled by
algorithm = "bisection",
basically starts with
an interval that probably encloses the
value of x that meets the goal,
and then successively narrows this
interval. A point inside this
interval, usually the mid-point but
can also be approximated by another
method (e.g., the power curve method),
is used as the estimate.
Though simple, there are cases in
which it can be slow. Nevertheless,
preliminary examination suggests that
this method is good enough for
scenarios in which only an approximate
sample size or range of sample sizes is
needed.
The internal workflow of
this method implemented in
x_from_power() can be found in
this technical vignette: https://sfcheung.github.io/power4mome/articles/x_from_power_workflow_bisection.html.
This method (belongs to the
surrogate function approximation family
reviewed in Chalmers, 2024),
enabled by algorithm = "power_curve",
starts with a crude
power curve based on a few points.
This tentative model is then used
to suggest the values to examine in
the next iteration. Unlike some other
implementations
of this family of methods, the form, not
just the parameters, of the
model can change across iterations,
as more and more data points are
available.
This method is the default method
for some scenarios, such as
x = "es" with goal = "ci_hit"
because the relation
between the power and the population
value of a parameter varies across
parameters, unlike the relation
between power and sample size, which
is monotonic. Therefore,
taking into account the working
power curve may help finding the
desired value of x.
Before version 0.1.1.33, this
method can be used only with
the goal "ci_hit". Since
version 0.1.1.34, it supports all
goals, like the bisection method.
The internal workflow of
this method implemented in
x_from_power() can be found in
this technical vignette: https://sfcheung.github.io/power4mome/articles/x_from_power_workflow_power_curve.html.
This method, proposed by Waeber and others (2013) for stochastic root-solving, has been adapted by Chalmers (2024) for power analysis. Similar to bisection, this method starts with an interval, with a initial probability for each value (or range of values), such as sample sizes, as the sample size with the target power. A value is then selected to estimate power (the median, by default). Based on the estimated power for this value, the distribution of probabilities is updated. This process is repeated until some termination criteria are met.
Unlike bisection, each iteration can be conducted with a small number of replications (e.g., 50). The method accumulate evidence in a Bayesian approach, and so the certainty of the solution is based on the accumulation of evidence from successive iterations, not on having strong evidence from a few iterations.
In x_from_power, the version
of probabilistic bisection proposed
by Waeber et al. (2013) was implemented,
with minor changes.
Most importantly, when some termination
criteria are met, the candidate value
of x (e.g., a sample size) will be
checked using a larger number of
replications (set by final_nrep)
to ensure that the estimated power
is indeed close enough to the target
power (based on tolerance value,
determined internally but can also be
set directly). If yes, it will be
returned as the solution. If no, then
the search will continue, until a
maximum number of candidate values
has been checked.
Although the bisection method can be fast in some situations (e.g., when the interval is narrow and the solution happens to be inside the interval), the probabilistic bisection method is nearly guaranteed to converge to the solution (as long as the solution is inside the interval). Therefore, this is the default algorithm in some scenarios.
The internal workflow of
this method implemented in
x_from_power() can be found in
this technical vignette: https://sfcheung.github.io/power4mome/articles/x_from_power_workflow_pba.html.
Chalmers, R. P. (2024). Solving variables with Monte Carlo simulation experiments: A stochastic root-solving approach. Psychological Methods. Advance online publication. doi:10.1037/met0000689
Waeber, R., Frazier, P. I., & Henderson, S. G. (2013). Bisection search with noisy responses. SIAM Journal on Control and Optimization, 51(3), 2261–2279. doi:10.1137/120861898
Wilson, E. B. (1927). Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association, 22(158), 209-212. doi:10.1080/01621459.1927.10502953
power4test(), power4test_by_n(),
and power4test_by_es().
# Specify the population model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " m ~ x: m y ~ m: l y ~ x: n " # Generate the datasets sim_only <- power4test(nrep = 5, model = mod, pop_es = mod_es, n = 100, do_the_test = FALSE, iseed = 2345) # Do a test test_out <- power4test(object = sim_only, test_fun = test_parameters, test_args = list(pars = "m~x")) # Determine the sample size with a power of .80 (default) # In real analysis, to have more stable results: # - Use a larger final_nrep (e.g., 400). # If the default values are OK, this call is sufficient: # power_vs_n <- x_from_power(test_out, # x = "n", # seed = 4567) power_vs_n <- x_from_power(test_out, x = "n", progress = TRUE, target_power = .80, final_nrep = 5, max_trials = 1, seed = 1234) summary(power_vs_n) plot(power_vs_n)# Specify the population model mod <- " m ~ x y ~ m + x " # Specify the population values mod_es <- " m ~ x: m y ~ m: l y ~ x: n " # Generate the datasets sim_only <- power4test(nrep = 5, model = mod, pop_es = mod_es, n = 100, do_the_test = FALSE, iseed = 2345) # Do a test test_out <- power4test(object = sim_only, test_fun = test_parameters, test_args = list(pars = "m~x")) # Determine the sample size with a power of .80 (default) # In real analysis, to have more stable results: # - Use a larger final_nrep (e.g., 400). # If the default values are OK, this call is sufficient: # power_vs_n <- x_from_power(test_out, # x = "n", # seed = 4567) power_vs_n <- x_from_power(test_out, x = "n", progress = TRUE, target_power = .80, final_nrep = 5, max_trials = 1, seed = 1234) summary(power_vs_n) plot(power_vs_n)