ARTICLE

Constructing Approximate Confidence Intervals for Parameters With Structural Equation Models

Pages 267-294
Published online: 15 Apr 2009

Confidence intervals (CIs) for parameters are usually constructed based on the estimated standard errors. These are known as Wald CIs. This article argues that likelihood-based CIs (CIs based on likelihood ratio statistics) are often preferred to Wald CIs. It shows how the likelihood-based CIs and the Wald CIs for many statistics and psychometric indexes can be constructed with the use of phantom variables (Rindskopf, 1984) in some of the current structural equation modeling (SEM) packages. The procedures to form CIs for the differences in correlation coefficients, squared multiple correlations, indirect effects, coefficient alphas, and reliability estimates are illustrated. A simulation study on the Pearson correlation is used to demonstrate the advantages of the likelihood-based CI over the Wald CI. Issues arising from this SEM approach and extensions of this approach are discussed.

Null hypothesis significance testing (NHST) has dominated statistical analyses in the social and behavioral sciences for decades. However, there is growing opposition to this state of affairs. Much of the opposition comes from the work of applied statisticians (e.g., Cohen, 1994; Harlow, Mulaik, & Steiger, 1997; Schmidt, 1996; among others), who have severely criticized the abuse and misuse of NHST as nonconstructive to advances in psychological research. Although NHST has its defenders (e.g., Chow, 1988; Nickerson, 2000; Wainer, 1999), an increasing number of scientists have found it difficult to accept that the soundness of scientific findings is evaluated with a binary decision based on comparing a p value to an arbitrary criterion such as .05 (Wilkinson & Task Force on Statistical Inference, 1999). Responding to this issue, the American Psychological Association (APA) Task Force on Statistical Inference (Wilkinson & Task Force on Statistical Inference, 1999) suggested that a better approach would be to report an actual p value or a confidence interval (CI). This article focuses on CIs only.

Instead of deciding whether or not a parameter estimate is statistically significant, CIs provide information on the precision of the parameter estimate. Recommendations on using CIs are generally observed in a variety of disciplines in the behavioral sciences, for example, psychology (APA, 2001; Wilkinson & Task Force on Statistical Inference, 1999), the medical sciences (International Committee of Medical Journal Editors, 2004), and neurophysiology (Curran-Everett & Benos, 2004).

A substantial number of recent articles describe how to construct CIs for different statistics and psychometric indexes; e.g., correlations (Algina, 1999; Olkin & Finn, 1995), effect sizes for the analysis of variance (ANOVA; Cumming & Finch, 2001; Steiger, 2004; Steiger & Fouladi, 1997), indirect effects (Cheung, in press; Cheung, 2007; MacKinnon, Lockwood, & Williams, 2004), coefficient alphas (Duhachek & Iacobucci, 2004), and reliability estimates (Raykov, 1998, 2002).

As Steiger and Fouladi (1997) commented, one of the reasons why CIs are seldom reported is that statistical packages and methods to construct CIs for different statistics and psychometric indexes are lacking in availability. Based on the noncentral F distribution, Steiger and Fouladi (1997; Steiger, 2004) proposed a general approach to constructing exact CIs in the context of ANOVA. Their method can be used to construct CIs for many effect sizes, such as $R^2$, the root mean square standardized effect (RMSSE), and standardized contrasts.

It is well known that many models can be formulated in structural equation modeling (SEM), of which the following are only a few: regression analyses, path analyses, confirmatory factor analyses (e.g., Bentler, 1995; Jöreskog & Sörbom, 1996), ANOVA to multivariate analysis of covariance (MANCOVA; Bagozzi & Yi, 1999; Cole, Maxwell, Arvey, & Salas, 1993), growth curve modeling (Willett & Sayer, 1994), SEM based meta-analysis (Cheung, 2008), meta-analytic SEM (Cheung & Chan, 2005), and multilevel modeling (Bauer, 2003; Curran, 2003; Mehta & Neale, 2005). Many research hypotheses involving covariances and correlations can be easily formulated in SEM (Cheung & Chan, 2004; Raykov, 2001b). This suggests that SEM can be used as a general model for many statistical analyses in behavioral research.

There are two objectives to this study. The first is to argue that likelihood-based CIs (CIs based on the likelihood ratio statistics) are often preferred to Wald CIs (CIs based on the estimated standard errors [SEs]). The second is to show how SEM can be used as a general, but simple, framework to construct the likelihood-based and Wald CIs on a variety of statistics and psychometric indexes with the use of phantom variables (Rindskopf, 1984). Several examples of constructing CIs with the SEM approach are illustrated with a real data set. A simulation study on the Pearson correlation is then used to demonstrate the differences between the likelihood-based CI and the Wald CI.

METHODS OF CONSTRUCTING CIs

A 100(1 − α)% CI on a parameter ($\theta$) is a random interval, calculated from the sample, that contains $\theta$ with the prespecified probability in the long run; that is,

$$\Pr(\hat{\theta}_{Lower} \le \theta \le \hat{\theta}_{Upper}) = 1 - \alpha, \tag{1}$$

where α is the significance level, and $\hat{\theta}_{Lower}$ and $\hat{\theta}_{Upper}$ are the estimates of the lower and upper limits on $\theta$, respectively (e.g., Rice, 1995). One common method of constructing CIs is by inverting a test statistic with a known distribution (e.g., Casella & Berger, 2002; Hahn & Meeker, 1991). This inversion confidence interval principle has been frequently used to construct CIs for effect sizes (see Cumming & Finch, 2001; Steiger, 2004; Steiger & Fouladi, 1997). Wald and likelihood-based CIs can be constructed by this principle.

Wald CIs

Maximum likelihood (ML) estimation, developed by R. A. Fisher in the 1920s, is probably the most common estimation method used in statistics. Under some regularity conditions (e.g., Azzalini, 1996, p. 71), ML estimators have many desirable properties. For instance, they are consistent, asymptotically unbiased, asymptotically efficient, and asymptotically normally distributed (e.g., Azzalini, 1996; Pawitan, 2001). When ML estimation is used as the estimation method, the test statistic $(\hat{\theta} - \theta)/SE(\hat{\theta})$ has an asymptotically standard normal distribution, where $\hat{\theta}$ is the maximum likelihood estimate (MLE), $SE(\hat{\theta})$ is the estimated standard error, and $\theta$ is the population value. Consequently, the 100(1 − α)% Wald CI for $\theta$ can be computed as

$$\hat{\theta} \pm z_{1-\alpha/2}\, SE(\hat{\theta}), \tag{2}$$

where $z_{1-\alpha/2}$ is the 100(1 − α/2)th percentile of the standard normal distribution.
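As a minimal sketch of Equation 2, the following Python function computes a Wald CI from an estimate and its SE. The function name `wald_ci` is hypothetical and is not part of any SEM package; the inputs are the estimate and SE reported for Example 1 later in the article.

```python
from statistics import NormalDist


def wald_ci(estimate, se, alpha=0.05):
    """Return the 100(1 - alpha)% Wald CI: estimate +/- z * se."""
    # z is the 100(1 - alpha/2)th percentile of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z * se, estimate + z * se


# Estimate and SE of the difference in dependent correlations (Example 1).
lower, upper = wald_ci(0.056, 0.076)
print(round(lower, 3), round(upper, 3))  # -0.093 0.205
```

The result agrees with the Wald CI of (−0.093, 0.205) reported for Example 1.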

If the parameter of interest is a complicated function of other parameters, the SE might not be directly obtainable. In these cases, the delta method can be used to approximate the $SE(\hat{\theta})$ (e.g., Agresti, 2002; Casella & Berger, 2002; Rice, 1995; see also Raykov & Marcoulides, 2004, for an introduction). The delta method has been applied to obtain the SEs of the parameters in a number of contexts, for example, partial and semipartial correlations (Olkin & Finn, 1995), indirect effects (Sobel, 1982), and scale reliability (Raykov, 2002).
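The delta method can be sketched numerically. The code below is an illustration under stated assumptions, not the procedure used by any particular package: the function name `delta_method_se` is hypothetical, the gradient is approximated by central differences rather than derived analytically, and the example assumes a zero covariance between the two estimates.

```python
import math


def delta_method_se(func, estimates, cov, h=1e-6):
    """First-order delta-method SE of func(estimates).

    cov is the asymptotic covariance matrix of the estimates (nested lists).
    """
    k = len(estimates)
    grad = []
    for i in range(k):
        up, down = list(estimates), list(estimates)
        up[i] += h
        down[i] -= h
        # Central-difference approximation to the partial derivative.
        grad.append((func(up) - func(down)) / (2 * h))
    # Variance of the function: gradient' * cov * gradient.
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(k) for j in range(k))
    return math.sqrt(var)


# Product of two path coefficients, using the estimates and SEs from Example 3
# (a = 0.460, SE = 0.050; b = 0.268, SE = 0.059); zero covariance is assumed.
se_ab = delta_method_se(
    lambda p: p[0] * p[1],
    [0.460, 0.268],
    [[0.050 ** 2, 0.0], [0.0, 0.059 ** 2]],
)
print(round(se_ab, 3))  # 0.03
```

Because this is a first-order approximation, it omits the small product-of-variances term that appears in the second-order formula for an indirect effect.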

Likelihood-Based CIs

Besides inverting a Z statistic to form a Wald CI, it is also possible to construct a likelihood-based CI by inverting a likelihood ratio (LR) statistic. When the null hypothesis ($H_0$: $\theta = \theta_0$) is correct, it can be shown that the LR statistic defined as

$$LR = -2\log\left[\frac{L(\hat{\theta}_0)}{L(\hat{\theta})}\right] = 2\left[\log L(\hat{\theta}) - \log L(\hat{\theta}_0)\right] \tag{3}$$

is asymptotically distributed as a chi-square variate with g degrees of freedom (df), where L(.) is the likelihood function; log(.) is the natural logarithm; $\hat{\theta}$ and $\hat{\theta}_0$ are the unrestricted and the restricted estimates, respectively; and g is the number of independent constraints imposed by the null hypothesis (e.g., Buse, 1982).

From Equation 3, the difference of two LR statistics on two estimates of the same parameter, where one is the MLE (treated as fixed) and the other is varied, is asymptotically distributed as a chi-square variate with 1 df. To construct a 100(1 − α)% likelihood-based CI ($\hat{\theta}_{Lower}$, $\hat{\theta}_{Upper}$) on a parameter, we move the parameter estimate (treated as varied) as far away as possible to the right from its MLE such that it is just statistically significant at the desired α significance level. That is, $\hat{\theta}_{Upper}$ can be obtained by gradually increasing the estimate until

$$\chi^2_{Upper} - \chi^2_{MLE} = \chi^2_{1,1-\alpha}, \tag{4}$$

where $\chi^2_{1,1-\alpha}$ is the critical value of the chi-square statistic with 1 df at the α significance level, and $\chi^2_{Upper}$ and $\chi^2_{MLE}$ are the chi-square statistics of $\hat{\theta}_{Upper}$ and $\hat{\theta}_{MLE}$, respectively. The lower limit $\hat{\theta}_{Lower}$ is obtained analogously by moving the estimate to the left of the MLE.
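The search described above can be sketched for a model with a single parameter. The example below is an illustration only: it uses a binomial proportion rather than an SEM, and the names `loglik` and `likelihood_ci` and the data (3 successes in 20 trials) are hypothetical. It moves away from the MLE in each direction by bisection until the LR statistic equals the chi-square critical value with 1 df.

```python
import math
from statistics import NormalDist


def loglik(p, x, n):
    """Binomial log-likelihood (constant term omitted)."""
    return x * math.log(p) + (n - x) * math.log(1 - p)


def likelihood_ci(x, n, alpha=0.05, tol=1e-10):
    """Likelihood-based CI for a binomial proportion (requires 0 < x < n)."""
    mle = x / n
    # The chi-square critical value with 1 df is the squared normal quantile.
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2

    def lr(p):
        return 2 * (loglik(mle, x, n) - loglik(p, x, n))

    def solve(inside, outside):
        # inside has LR < crit, outside has LR > crit; bisect to the boundary.
        while abs(outside - inside) > tol:
            mid = (inside + outside) / 2
            if lr(mid) < crit:
                inside = mid
            else:
                outside = mid
        return (inside + outside) / 2

    eps = 1e-12
    return solve(mle, eps), solve(mle, 1 - eps)


lower, upper = likelihood_ci(3, 20)
print(lower, upper)
```

In this example the upper limit lies farther from the MLE than the lower limit, which a symmetric Wald CI cannot reproduce.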

If there is more than one parameter in the likelihood function, this approach can be modified by using the profile likelihood method (Pawitan, 2001). The profile likelihood method reduces the likelihood function with multiple parameters to a likelihood function with a single parameter by treating other parameters as nuisance parameters and maximizing over them. To reduce the computational burden, researchers can approximate the profile likelihood functions with various numerical methods (e.g., Neale & Miller, 1997; Venzon & Moolgavkar, 1988). In this article, I use the generic term likelihood-based CIs to denote CIs formed by true likelihood functions or (approximate) profile likelihood functions.

Relationship Between the Wald CI and the Likelihood-Based CI

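It can be shown that the Wald statistic is an approximation of the LR statistic by using a second-order Taylor's expansion of the log-likelihood function $\log L(\theta)$ around the MLE $\hat{\theta}$ (e.g., Pawitan, 2001, p. 33):

$$\log L(\theta) \approx \log L(\hat{\theta}) + \frac{d \log L(\hat{\theta})}{d\theta}(\theta - \hat{\theta}) + \frac{1}{2}\frac{d^2 \log L(\hat{\theta})}{d\theta^2}(\theta - \hat{\theta})^2, \tag{5}$$

where $d(\log L(\hat{\theta}))/d\theta$ and $d^2(\log L(\hat{\theta}))/d\theta^2$ are the first and second derivatives of $\log L(\theta)$ evaluated at $\hat{\theta}$. As the first derivative of the log-likelihood function is zero at the MLE, Equation 5 reduces to

$$\log L(\theta) \approx \log L(\hat{\theta}) - \frac{1}{2} I(\hat{\theta})(\theta - \hat{\theta})^2, \tag{6}$$

where $I(\hat{\theta}) = -d^2(\log L(\hat{\theta}))/d\theta^2$ is the observed Fisher information that indicates the curvature of the quadratic approximation of the log-likelihood function. The $SE(\hat{\theta})$ can be obtained by $1/\sqrt{I(\hat{\theta})}$. 1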

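Based on Equation 6, we can construct a Wald statistic for testing $H_0$: $\theta = \theta_0$ versus $H_1$: $\theta \ne \theta_0$,

$$W = I(\hat{\theta})(\hat{\theta} - \theta_0)^2 = \frac{(\hat{\theta} - \theta_0)^2}{SE(\hat{\theta})^2}. \tag{7}$$

By taking the square root of W, we have the famous result that $(\hat{\theta} - \theta_0)/SE(\hat{\theta})$ is asymptotically normally distributed. It is well known that the LR and the Wald statistics are asymptotically equivalent (e.g., Buse, 1982).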

Comparisons Between the Wald CI and the Likelihood-Based CI

Wald CIs

The Fisher information matrix $I(\hat{\theta})$, and thus the $SE(\hat{\theta})$, is usually available after obtaining the MLE. Therefore, it is convenient to construct Wald CIs. As shown in Equations 5 and 6, the Wald approach is based on the second-order quadratic approximation of the log-likelihood function. The Wald CI and the likelihood-based CI would be exactly the same in a few special cases only (see Buse, 1982).

In most cases, the Wald CI and the likelihood-based CI will differ. The appropriateness of using the Wald statistic to approximate the log-likelihood function depends on several factors, such as the model being analyzed, the sampling distribution of the parameter estimate, and the sample size. In reality, however, it is hard to tell whether the quadratic approximation is good or not. One method to determine this is to plot the true log-likelihood and its quadratic approximation graphically. Researchers can visually check the appropriateness of the quadratic approximation (see Pawitan, 2001, for some examples).

There are several criticisms about the use of Wald CIs. Wald CIs based on SEs assume implicitly that the log-likelihood function for the quantity of interest can be closely approximated by a quadratic function, and hence is symmetric. Therefore, Wald CIs are always symmetric around the MLE. However, in many cases the actual likelihood function is asymmetric, in which case the quadratic approximation underlying the Wald CIs can work poorly unless large samples are involved (Pawitan, 2001). For example, the sampling distributions of the parameters of the variances, correlation coefficients that are close to ±1, and the product term of two random variables in indirect effects, are usually asymmetric. In these cases, the symmetric Wald CIs are too optimistic in ruling out values of the parameter at one end and too pessimistic in ruling out values of the parameter at the other end (DiCiccio & Efron, 1996).

A related issue with the symmetric CIs is that the CIs might be outside of the meaningful boundaries, for example, a negative lower limit for the variance. These CIs can be truncated to the meaningful bounds, say zero for variance and ±1 for correlation. However, Steiger and Fouladi (1997) warned that, although the coverage probability for the truncated CI is maintained, the width of the CI might be suspect as an indicator of the precision of the measurement because the nonsensical values have been truncated. 2

Another problem with Wald CIs (and the significance tests based on Wald statistics) is that they are not invariant to monotonic transformations on the parameters (DiCiccio & Efron, 1996; Neale & Miller, 1997). For example, the coverage probabilities of the Wald CIs on a parameter, say a Pearson correlation, and its monotonic transformation, say a Fisher's z-transformed score, need not be the same. This issue is particularly annoying in SEM because there might be many equivalent models formed by different model parameterizations. Inferences based on the Wald CIs (and SEs) on different equivalent models might be totally different (Gonzalez & Griffin, 2001; Neale & Miller, 1997; Neale, Heath, Hewitt, Eaves, & Fulker, 1989).
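The lack of invariance, and the boundary problem noted earlier, can be illustrated with a small sketch. The function names and the inputs (r = .95, n = 10) are hypothetical. The SE of r uses the large-sample variance $(1 - \rho^2)^2/n$ given later in Example 1, and the SE of Fisher's z uses the standard large-sample value $1/\sqrt{n - 3}$, which is an assumption brought in for this illustration.

```python
import math
from statistics import NormalDist


def wald_ci_r(r, n, alpha=0.05):
    """Wald CI constructed directly on r, with SE = (1 - r^2) / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = (1 - r ** 2) / math.sqrt(n)
    return r - z * se, r + z * se


def fisher_z_ci_r(r, n, alpha=0.05):
    """Wald CI on Fisher's z = atanh(r), transformed back to the r metric."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = 1 / math.sqrt(n - 3)
    centre = math.atanh(r)
    return math.tanh(centre - z * se), math.tanh(centre + z * se)


print(wald_ci_r(0.95, 10))      # upper limit exceeds 1
print(fisher_z_ci_r(0.95, 10))  # both limits stay inside (-1, 1)
```

The two Wald CIs describe the same parameter yet disagree, and only the one built on the transformed scale respects the ±1 bounds.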

Likelihood-based CIs

Likelihood-based CIs offer many improvements over Wald CIs (e.g., Meeker & Escobar, 1995; Neale & Miller, 1997; Pawitan, 2001). For example, they use the log-likelihood function directly instead of its quadratic approximation, they are asymmetric in capturing the sampling distribution of the parameter estimates, and they are invariant to monotonic transformations.

Likelihood-based CIs have been suggested as alternatives to Wald CIs in areas where the Wald CIs are known to perform poorly. For example, Agresti (2002) recommended the use of likelihood-based CIs over Wald CIs in analyzing categorical data when the sample sizes are small to moderate. Similar suggestions have been offered in nonlinear regressions (e.g., Bates & Watts, 1988; Seber & Wild, 1989). More examples in which likelihood-based CIs are used include: random effects in meta-analyses (Hardy & Thompson, 1996; Viechtbauer, 2005), logistic regressions (SAS Publishing, 2004), and generalized linear models (SAS Publishing, 2004). Readers can refer to Pawitan (2001) for more applications utilizing the likelihood approach.

Although the properties of the likelihood-based CIs seem appealing, there are still issues surrounding their use. First and foremost, researchers need to make distributional assumptions on the data to construct likelihood-based CIs. The use of likelihood-based CIs is questionable when the specified log-likelihood function is inappropriate. As Casella and Berger (2002), among others, have cautioned, there is no guarantee that the likelihood-based CIs will be optimum, although they will seldom be too bad. Despite this, they still “recommend constructing a confidence set based on inverting an LRT [likelihood ratio test], if possible” (p. 430).

Constructing Likelihood-Based and Wald CIs in SEM

Wald CIs can be constructed easily based on the SE that is routinely reported in most statistical packages, whereas likelihood-based CIs are seldom available except in a few statistical packages; for example, SAS (2004) provides likelihood-based CIs on some of its procedures. Constructing likelihood-based CIs for general models might require the use of some special computer programs, such as the Bhat package (Luebeck, 2009) available in R (R Development Core Team, 2007), in which the log-likelihood function of the model has to be given explicitly.

As SEM models subsume many popular statistical techniques as special cases, I decided to use SEM to construct the likelihood-based CIs and the Wald CIs. To construct a likelihood-based or Wald CI using the SEM approach, we need to utilize the phantom variable (Rindskopf, 1984) to store the parameters of interest. Phantom variables are also known as the node in Horn and McArdle (1980) and the auxiliary variable in Raykov and Shrout (2002). They are “latent variables with no observed indicators” (Rindskopf, 1984, p. 38). They are frequently used to implement equality and inequality constraints in SEM (e.g., McArdle & Hamagami, 1996; Rindskopf, 1983; Woody & Sadler, 2005). Raykov (1997, 2001a, 2004; Raykov & Shrout, 2002) extended the use of phantom variables to estimate composite reliability.

This article generalizes the use of phantom variables to estimate the parameters of interest for a variety of models and to construct the CIs on the parameters directly. In the first step the model of interest in SEM is formulated. Then, a phantom variable with zero variance is added. An observed variable arbitrarily regresses on the phantom variable. 3 The path leading from the phantom variable to the observed variable is then constrained to equal the parameter of interest by imposing suitable linear or nonlinear constraints.

Because the variance of the phantom variable is fixed at zero and the path leading from the phantom variable to any arbitrarily selected variable is simply a computed value, the introduction of the phantom variable and the constraints has no effect on the implied covariance matrix, the model fit, and the parameter estimates (Raykov & Shrout, 2002, p. 203). After fitting the model, the parameter of interest is stored in the path leading from the phantom variable. Likelihood-based CI or Wald CI on the parameter can be constructed depending on the SEM packages used. It should be emphasized that the Wald CIs constructed by the use of phantom variables in this SEM approach are equivalent to the CIs constructed based on the delta method. 4

The main advantage of using the SEM approach to constructing Wald CIs is to avoid the tedious calculations involved in the delta method. This article illustrates the procedures to construct the likelihood-based CIs with Mx (Neale, Boker, Xie, & Maes, 2006) and the Wald CIs with LISREL (Jöreskog & Sörbom, 1996) and Mplus (Muthén & Muthén, 2004). 5 Implementation of the phantom variables can be simplified under these packages. LISREL and Mplus provide functions to create new parameters (AP in LISREL and NEW option in Mplus), whereas Mx also provides functions to create matrices storing new parameters.

ILLUSTRATION WITH EMPIRICAL EXAMPLES

In the following sections, common methods of forming CIs on some selected examples are presented. The likelihood-based and Wald CIs constructed by the SEM approach are compared against them. A real data set from the World Values Survey II (World Values Study Group, 1994) was used for illustrative purposes. Between 1990 and 1993, 57,561 adults aged 18 and older from 42 nations were interviewed by local academic institutes in Eastern European nations and by professional survey organizations in other nations. For the sake of illustration I randomly selected a sample of 200 participants without missing values from the total sample of 1,839 Americans. Five variables were selected for demonstration: state of health (SH), life satisfaction (LS), job satisfaction (JS), home-life satisfaction (HS), and job autonomy (JA). Five items (X1–X5) measuring aspects of sexual attitudes were also selected to illustrate the forming of a CI on the coefficient alpha and the reliability estimate. These items were supposed to form a general construct on sexual attitudes. The descriptive statistics for these variables are shown in Table 1.

TABLE 1 Descriptive Statistics of the Variables

In the illustrations, variables are denoted in capital letters and parameters, such as variances and path coefficients, are denoted with small letters in the figures and the text. P and Q represent the phantom variables introduced in the analysis and p and q denote the paths leading from the phantom variables to arbitrarily selected observed variables. Appendix A includes the Mplus codes for the analysis. 6

Example 1: Difference in Dependent Correlations

Hypotheses related to dependent correlation coefficients are of great interest in behavioral research. Olkin and Finn (1995) provided a systematic treatment of how to test dependent correlation coefficients and construct Wald CIs using the delta method (see Hittner, May, & Silver, 2003, for a recent simulation study; and Steiger, 1980a, 1980b, for an alternative approach based on the Fisher's z transformation). For example, to construct the Wald CI on the difference in two dependent correlations ($\rho_{ij} - \rho_{ik}$), where $\rho_{ij}$ and $\rho_{ik}$ denote the population correlations for the ith and jth variables, and for the ith and kth variables, respectively, one can use the large sample approximation,

$$SE(r_{ij} - r_{ik}) = \sqrt{\mathrm{var}(r_{ij}) + \mathrm{var}(r_{ik}) - 2\,\mathrm{cov}(r_{ij}, r_{ik})}, \tag{8}$$

where $\mathrm{var}(r_{ij}) = (1 - \rho_{ij}^2)^2/n$, $\mathrm{var}(r_{ik}) = (1 - \rho_{ik}^2)^2/n$, $\mathrm{cov}(r_{ij}, r_{ik}) = (0.5(2\rho_{jk} - \rho_{ij}\rho_{ik})(1 - \rho_{ij}^2 - \rho_{ik}^2 - \rho_{jk}^2) + \rho_{jk}^3)/n$, and n is the sample size for the correlation coefficients. Because the population correlations are seldom known, the sample estimates (rs) are substituted for the corresponding ρs (see Olkin & Finn, 1995). Given the SE, the Wald CI can then be constructed by Equation 2. In our example, we were interested in constructing a 95% CI on the difference of the correlation coefficients $\rho_{LS.HS}$ and $\rho_{LS.JS}$. The sample estimate for ($\rho_{LS.HS} - \rho_{LS.JS}$) was 0.056. Based on Equation 8, the SE was 0.076 and the 95% Wald CI on ($\rho_{LS.HS} - \rho_{LS.JS}$) was (−0.093, 0.205).
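The large-sample approximation above can be sketched as a short function. The name `se_diff_dependent_r` is hypothetical, and the correlations passed in the usage line are made-up values for illustration, because the sample correlations from Table 1 are not reproduced in this section.

```python
import math


def se_diff_dependent_r(r_ij, r_ik, r_jk, n):
    """SE of (r_ij - r_ik) from the large-sample variances and covariance."""
    var_ij = (1 - r_ij ** 2) ** 2 / n
    var_ik = (1 - r_ik ** 2) ** 2 / n
    cov = (
        0.5 * (2 * r_jk - r_ij * r_ik) * (1 - r_ij ** 2 - r_ik ** 2 - r_jk ** 2)
        + r_jk ** 3
    ) / n
    return math.sqrt(var_ij + var_ik - 2 * cov)


# Hypothetical sample correlations with n = 200 (not the article's data).
print(se_diff_dependent_r(0.5, 0.4, 0.3, 200))
```

The resulting SE can be combined with the difference in sample correlations in Equation 2 to obtain the Wald CI.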

It has been recognized for decades that SEM can be used to test independent and dependent correlations (e.g., Bentler & Lee, 1983; Jöreskog, 1978; McDonald, 1975; see also Steiger, 2005, for a review on comparing correlations). To estimate the difference in the dependent correlations in the SEM approach, we formulated the analysis of dependent correlations as a special case of the confirmatory factor analytic (CFA) model (Cheung & Chan, 2004). In our example, a three-factor CFA model with factor variances fixed as ones, a diagonal matrix for the factor loadings representing the standard deviations of the variables, and zero error variances, was formulated (see Figure 1a).

FIGURE 1 (a) A model on estimating the difference in dependent correlation coefficients. (b) A regression model on regressing LS on HS and JS.

With this model specification, the factor correlations (m, n, and r in Figure 1) were indeed the observed correlations. To calculate the difference between r LS.HS and r LS.JS , we impose the following constraint,

Because the parameter estimates were the same, I reported the SEs and CIs only. The SE and the 95% Wald CI estimated in Mplus were 0.076 and (−0.093, 0.205), respectively. The Wald CI based on the SEM approach was numerically the same as the one based on Olkin and Finn (1995). The 95% likelihood-based CI formed was (−0.094, 0.208). The likelihood-based CI and the Wald CI are similar in this example.

Example 2: Squared Multiple R and Standardized Regression Coefficient

Squared multiple R

In a multiple regression analysis, the squared multiple correlation ($R^2$) is one of the most frequently reported statistics. There are several methods of constructing CIs for the $R^2$ (see Algina, 1999, for a simulation study comparing some of them). Steiger and Fouladi (1997) described how to compute the exact CIs on the $R^2$ using the noncentrality approach.

An alternative way of constructing the CI on the $R^2$ is based on the SEM approach. The squared multiple correlation can be expressed by

$$R^2 = \frac{\gamma'\,\mathrm{var}(x)\,\gamma}{\gamma'\,\mathrm{var}(x)\,\gamma + \mathrm{var}(e)}, \tag{10}$$

where γ is the m × 1 column vector of the regression coefficients, var(x) is the m × m variance-covariance matrix of the predictors, and var(e) is the unexplained variance of the dependent variable. In our example, we were interested in constructing the CI on the $R^2$ by regressing LS on JS and HS (see Figure 1b). To estimate the $R^2$ in our example, we impose the following constraint:

By running a regression analysis on the data, the $R^2$ was .254. Using the computer program R2 developed by Steiger and Fouladi (1992), the 95% CI was (0.150, 0.357). By fitting the regression model in the SEM approach, the SE and the 95% Wald CI were 0.053 and (0.150, 0.359), respectively. The 95% likelihood-based CI estimated was (0.155, 0.361). They are close to the values obtained by the computer program R2.
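The expression for the squared multiple correlation in terms of the regression coefficients, the predictor covariance matrix, and the unexplained variance can be sketched as follows. The function name `r_squared` and the input values are hypothetical and are not the article's estimates.

```python
def r_squared(gamma, var_x, var_e):
    """R^2 = explained variance / (explained variance + unexplained variance)."""
    m = len(gamma)
    # Quadratic form gamma' var(x) gamma gives the explained variance.
    explained = sum(
        gamma[i] * var_x[i][j] * gamma[j] for i in range(m) for j in range(m)
    )
    return explained / (explained + var_e)


# Two hypothetical predictors with unit variances, covariance 0.3,
# coefficients 0.5 and 0.2, and unexplained variance 1.5.
print(r_squared([0.5, 0.2], [[1.0, 0.3], [0.3, 1.0]], 1.5))
```

In the SEM approach, the same function of the model parameters is what the path from the phantom variable is constrained to equal.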

Standardized regression coefficients

Because scales in most psychological measurements are arbitrary, researchers sometimes prefer standardized parameter estimates to parameter estimates in the original scales (e.g., Hunter & Hamilton, 2002). One example is the standardized regression coefficient in multiple regressions. Although it is a simple matter to calculate the standardized regression coefficient $\hat{\beta}_x$ from the unstandardized regression coefficient $B_x$ by

$$\hat{\beta}_x = B_x \times \frac{SD_x}{SD_y}, \tag{12}$$

where $SD_x$ and $SD_y$ are the standard deviations of X and Y, respectively, it is not so easy to construct the CI on $\hat{\beta}_x$. Because $SD_x$ and $SD_y$ are both random variables, the SE on $\hat{\beta}_x$ might not equal $SE_{B_x} \times SD_x/SD_y$, especially in small samples (Bentler & Lee, 1983; Bollen, 1989). With the exception of a few statistical and SEM packages (e.g., RAMONA [Browne & Mels, 1992], SEPATH [Steiger, 1995], and PROC CALIS in the next release [Yung, 2005]), most SEM packages do not report SEs (and CIs) for the standardized parameters.

Even though it is possible to derive the SE on $\hat{\beta}_x$ with the delta method, the calculations are tedious. An easier method is to use the SEM approach. In Figure 1b, we were interested in obtaining the standardized regression coefficient of JS. To calculate $\hat{\beta}_{JS}$, we introduce a phantom variable Q and impose Equation 12 on q; that is,

$$q = B_{JS} \times \frac{SD_{JS}}{SD_{LS}}. \tag{13}$$

Then, the estimated value on q is $\hat{\beta}_{JS}$. CIs on the standardized regression coefficients are easily obtained using the SEM packages.

In our example, the unstandardized $B_{JS}$ was 0.206, and the standard deviations of JS and LS were 2.009 and 1.487, respectively. Thus, the standardized regression coefficient was $\hat{\beta}_{JS} = 0.278$. The SE and the 95% Wald CI were 0.061 and (0.158, 0.398), respectively, whereas the 95% likelihood-based CI was (0.155, 0.395).
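The point estimate can be checked with a one-line calculation. The function name `standardized_coefficient` is hypothetical; the inputs are the values reported in the paragraph above.

```python
def standardized_coefficient(b, sd_x, sd_y):
    """Standardized coefficient: unstandardized coefficient times SD_x / SD_y."""
    return b * sd_x / sd_y


# Unstandardized coefficient of JS and the standard deviations of JS and LS.
beta_js = standardized_coefficient(0.206, 2.009, 1.487)
print(round(beta_js, 3))  # 0.278
```

This reproduces only the point estimate; the SE and CI still require the delta method or the SEM approach, because the standard deviations are themselves estimated.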

Example 3: Mediating Effects

One mediator

A mediator is a variable accounting for all or part of the relation between a predictor and a dependent variable (Baron & Kenny, 1986). Mediating or indirect effects are widely conceptualized in psychological research. Suppose that we wanted to test whether JS was a mediator between JA and LS. We might formulate a path model for testing (see Figure 2a). The estimated indirect effect is simply ab. It is common to form CIs on the mediating effect with the Wald approach and the bootstrap method (see MacKinnon, Fairchild, & Fritz, 2007; MacKinnon, Lockwood, Hoffman, West, & Sheets, 2002; MacKinnon et al., 2004). Cheung (2007) compared several CIs and found that the likelihood-based CI and the bootstrap CI outperformed the Wald CI in most settings.

FIGURE 2 (a) A path model with one mediator. (b) A path model with two specific mediators. (c) A path model with two intermediate mediators.

Sobel (1982), based on the delta method, derived the SE on the product term ab as

$$SE_{ab} = \sqrt{a^2 SE_b^2 + b^2 SE_a^2 + SE_a^2 SE_b^2}, \tag{14}$$

where $SE_a$ and $SE_b$ are the asymptotic standard errors of a and b, respectively. As the term $SE_a^2 SE_b^2$ is relatively small, it is sometimes dropped in the calculation (e.g., Baron & Kenny, 1986). The Wald CI on the mediating effect can then be constructed by Equation 2. As shown in Figure 2a, we can impose the following constraint to estimate the indirect effect:

$$p = ab. \tag{15}$$

In our example, the estimates of a and its SE were 0.460 and 0.050, and the estimates of b and its SE were 0.268 and 0.059. The estimated indirect effect was then 0.123. Based on Sobel's (1982) formula, the $SE_{ab}$ was 0.030. The 95% Wald CI on the indirect effect was (0.064, 0.182). By using the SEM approach, the SE and the 95% Wald CI estimated in Mplus were 0.030 and (0.065, 0.182), respectively. The Wald CIs based on Sobel's formula and the SEM approach are essentially the same. The 95% likelihood-based CI that was constructed was (0.068, 0.187). The Wald and the likelihood-based CIs are comparable.
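The hand calculation for the indirect effect can be sketched in a few lines. The function name `sobel_se` and its `second_order` flag are hypothetical; the flag controls whether the small product-of-variances term, which the text notes is sometimes dropped, is included. The inputs are the rounded estimates reported above.

```python
import math
from statistics import NormalDist


def sobel_se(a, se_a, b, se_b, second_order=True):
    """SE of the product ab; optionally include the se_a^2 * se_b^2 term."""
    var = a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2
    if second_order:
        var += se_a ** 2 * se_b ** 2
    return math.sqrt(var)


a, se_a, b, se_b = 0.460, 0.050, 0.268, 0.059
indirect = a * b
se_ab = sobel_se(a, se_a, b, se_b)
z = NormalDist().inv_cdf(0.975)
ci = (indirect - z * se_ab, indirect + z * se_ab)
print(round(indirect, 3), round(se_ab, 3), ci)
```

Because the inputs are rounded to three decimals, the limits can differ from the reported Wald CI in the third decimal place.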

Two specific indirect effects

Psychological theories sometimes involve multiple specific mediators (MacKinnon, 2000). Multiple specific mediators mean that the effects from the predictor to the dependent variable are separately mediated by more than one mediator. In our example, suppose that we wanted to test whether LS and JS might be two competing specific mediators (see Figure 2b). The specific mediating effect via LS is ab and the specific mediating effect via JS is cd. It is often of interest to estimate the total mediating effect (ab + cd) and the difference of these specific mediating effects (ab − cd; Bollen, 1987).

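Based on the delta method, MacKinnon (2000, pp. 148, 151) provided the following formulas to estimate the $SE_{ab+cd}$ on the total mediating effect and the $SE_{ab-cd}$ on the difference between two specific mediating effects:

$$SE_{ab+cd} = \sqrt{SE_a^2 b^2 + SE_b^2 a^2 + SE_c^2 d^2 + SE_d^2 c^2 + 2ac\,\mathrm{cov}(b, d)} \tag{16}$$

$$SE_{ab-cd} = \sqrt{SE_a^2 b^2 + SE_b^2 a^2 + SE_c^2 d^2 + SE_d^2 c^2 - 2ac\,\mathrm{cov}(b, d)}, \tag{17}$$

where cov(b, d) is the asymptotic covariance between the estimates of b and d. Given the SEs, Wald CIs on the total and the difference of the specific mediating effects can then be constructed.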

An alternative approach is to use SEM. We can fit the model as the one in Figure 2b. To estimate the total mediating effect and the difference between the specific mediating effects, we impose the following constraints:

$$p = ab + cd \tag{18}$$

and

$$q = ab - cd. \tag{19}$$

After fitting this model, the indirect effects for ab and cd for our data were 0.008 and 0.016, respectively. The estimate of the total mediating effect was 0.024. The $SE_{ab+cd}$ and the 95% Wald CI were 0.014 and (−0.004, 0.051), respectively. The 95% likelihood-based CI on the total mediating effect was (0.002, 0.048). The difference between the specific indirect effects was −0.007. The $SE_{ab-cd}$ and the 95% Wald CI were 0.014 and (−0.035, 0.020), respectively, and the 95% likelihood-based CI was (−0.037, 0.021). Both CIs are comparable.
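The delta-method SEs for the total and the difference of two specific mediating effects can be sketched as below. This is an illustration that assumes, as the text does, that the only nonzero covariance is between the estimates of b and d. The function name and all input values are hypothetical, because the individual path estimates are not reported in this section.

```python
import math


def se_total_and_difference(a, se_a, b, se_b, c, se_c, d, se_d, cov_bd):
    """Return (SE of ab + cd, SE of ab - cd) by the first-order delta method."""
    base = (
        se_a ** 2 * b ** 2
        + se_b ** 2 * a ** 2
        + se_c ** 2 * d ** 2
        + se_d ** 2 * c ** 2
    )
    # The covariance between b and d enters with opposite signs.
    cross = 2 * a * c * cov_bd
    return math.sqrt(base + cross), math.sqrt(base - cross)


# Hypothetical path estimates, SEs, and covariance (not the article's data).
se_total, se_diff = se_total_and_difference(
    0.3, 0.05, 0.2, 0.06, 0.4, 0.05, 0.1, 0.07, -0.001
)
print(se_total, se_diff)
```

When cov(b, d) is zero, the two SEs coincide; otherwise the sign of the covariance determines which is larger.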

Indirect effect with two intermediate mediators

Psychological theories can involve more than one intermediate mediator. That is, the mediating effect from an independent variable to a dependent variable is hypothesized via two or more consecutive mediators (see Premack & Hunter, 1988, for an example). In our example, for instance, we might want to test a model specifying the effect of JA to SH consecutively mediated by JS and LS (see Figure 2c). The indirect effect with two intermediate mediators can be easily estimated by abc. The mediating effect with two intermediate mediators can be estimated by introducing the following constraint:

$$p = abc. \tag{20}$$

In our example, the estimated mediating effect and its $SE_{abc}$ were 0.008 and 0.005, respectively. Thus, the 95% Wald CI was (−0.002, 0.017). The 95% likelihood-based CI formed was (−0.001, 0.019). Both of these CIs are comparable.

Example 4: Reliability Estimates

It has been widely recognized that psychological measurements are subject to measurement errors. If the true-score theory is used, reliability estimates for the data being analyzed should be reported even if the focus of the research is not psychometric (Wilkinson & Task Force on Statistical Inference, 1999). It would even be better, however, if CIs on the reliability estimates were to be reported (Fan & Thompson, 2001).

Cronbach's coefficient alpha

Cronbach's (1951) coefficient alpha (α) is the most popular index for measuring the internal consistency of a measurement. The sample estimate of the coefficient alpha is

$$\hat{\alpha} = \frac{t}{t - 1}\left[1 - \frac{\mathrm{tr}(V)}{j'Vj}\right], \tag{21}$$

where t is the number of items in the scale, V is the variance–covariance matrix among the items, tr(.) is the trace operator summing all diagonal elements of a square matrix, and j is a t × 1 vector of ones.

Duhachek and Iacobucci (2004) compared several methods of constructing CIs on the coefficient alpha. They found that the method proposed by van Zyl, Heinz, and Nel (2000) performed best in terms of the coverage probability. The estimated standard error proposed by van Zyl et al. is:

$$SE_{\hat{\alpha}} = \sqrt{\frac{1}{n}\,\frac{2t^2}{(t - 1)^2 (j'Vj)^3}\left[(j'Vj)\left(\mathrm{tr}(V^2) + \mathrm{tr}^2(V)\right) - 2\,\mathrm{tr}(V)\,(j'V^2 j)\right]}, \tag{22}$$

where n is the sample size (see also Kistner & Muller, 2004, for a method to obtain the exact sampling distribution of coefficient alphas).

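Supposing that X1 to X5 in Table 1 measure the same construct, we might want to estimate its coefficient alpha. To estimate the coefficient alpha in the SEM approach, we may fit a saturated model on the covariance matrix as shown in Figure 3a. To compute the coefficient alpha, we impose the following constraint on the phantom variable P, where v_ij denotes the (i, j)th element of the estimated covariance matrix V and t = 5:

$$P = \frac{t}{t-1}\left(1 - \frac{\sum_{i=1}^{t} v_{ii}}{\sum_{i=1}^{t}\sum_{j=1}^{t} v_{ij}}\right)$$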

FIGURE 3 (a) A model for estimating Cronbach's alpha. (b) A one-factor confirmatory factor analytic model for estimating the reliability.

The estimated coefficient alpha and its SE in our data were 0.788 and 0.023 (by Equation 22), respectively. The 95% Wald CI was (0.744, 0.832). (See Duhachek & Iacobucci, 2004, for the SPSS code to conduct the computations.) With the SEM approach, the SE and the 95% Wald CI were 0.022 and (0.744, 0.832), respectively, which were essentially the same as those based on Equation 22. The 95% likelihood-based CI was (0.740, 0.829). The likelihood-based and the Wald CIs are similar.

Reliability estimates

As is well known, with uncorrelated errors the coefficient alpha equals the reliability only when the items are essentially tau-equivalent. Essentially tau-equivalent items have equal true-score variances but possibly different error variances (e.g., Bollen, 1989; Raykov, 1997). When the items are not essentially tau-equivalent or the measurement errors are correlated, the coefficient alpha might overestimate or underestimate the true reliability (e.g., Lord & Novick, 1968; Raykov, 2002; Raykov & Shrout, 2002). A more appropriate estimate of the reliability is the ratio of the estimated true-score variance to the observed score variance (e.g., Bollen, 1989; Miller, 1995; Raykov, 1997, 2002; Raykov & Shrout, 2002).

To come up with an estimate of reliability, we might fit a one-factor CFA model as shown in Figure 3b, where the factor variance of F is fixed at 1.0 for identification and P is the phantom variable computing the reliability estimate. The estimated scale reliability $\hat{\rho}_{xx}$ for an unweighted composite score on t items is defined as

$$\hat{\rho}_{xx} = \frac{\left(\sum_{i=1}^{t}\hat{\lambda}_i\right)^{2}}{\left(\sum_{i=1}^{t}\hat{\lambda}_i\right)^{2} + \sum_{i=1}^{t}\hat{\sigma}^{2}_{E_i}},$$

where $\hat{\lambda}_i$ and $\hat{\sigma}^{2}_{E_i}$ are the estimated factor loading and error variance of the ith item (e.g., Raykov, 2002). The preceding formula assumes that the measurement errors are uncorrelated. If the measurement errors are correlated, it is easy to extend it by including the covariances of the measurement errors in the denominator (see Raykov & Shrout, 2002).
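The reliability formula for an unweighted composite is simple enough to sketch in a few lines of Python. The function below assumes a one-factor model with the factor variance fixed at 1.0. The optional `error_covariances` argument is a hypothetical way of passing the distinct error covariances, each of which enters the composite error variance twice. The loadings and error variances in the example are made up for illustration.

```python
def composite_reliability(loadings, error_variances, error_covariances=()):
    """Reliability of an unweighted sum score under a one-factor model.

    loadings          : factor loadings (factor variance fixed at 1.0)
    error_variances   : error variance of each item
    error_covariances : distinct covariances among errors (i < j), if any;
                        each is counted twice in the composite variance
    """
    true_var = sum(loadings) ** 2
    error_var = sum(error_variances) + 2 * sum(error_covariances)
    return true_var / (true_var + error_var)


# Hypothetical estimates for five items, for illustration only.
lam = [0.70, 0.60, 0.80, 0.50, 0.65]
err = [0.51, 0.64, 0.36, 0.75, 0.5775]

rel_uncorrelated = composite_reliability(lam, err)
rel_correlated = composite_reliability(lam, err, error_covariances=[0.10])
print(round(rel_uncorrelated, 3), round(rel_correlated, 3))
```

A positive error covariance enlarges the denominator, so the second value is smaller than the first.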

We can then calculate the reliability estimate with a phantom variable and directly construct its likelihood-based CI or Wald CI, depending on which SEM package is used (see Note 7).

To estimate the reliability, we impose the following equality constraint on the phantom variable P, using the factor loadings $\lambda_i$ and error variances $\sigma^{2}_{E_i}$ of the five items:

$$P = \frac{\left(\sum_{i=1}^{5}\lambda_i\right)^{2}}{\left(\sum_{i=1}^{5}\lambda_i\right)^{2} + \sum_{i=1}^{5}\sigma^{2}_{E_i}}$$

The one-factor CFA model fitted the data reasonably well with χ²(5) = 11.452, p = .043; sample size = 200; the comparative fit index (CFI) = 0.976; the root mean squared error of approximation (RMSEA) = 0.080, 95% CI of RMSEA = (0.013, 0.143); and the standardized root mean squared residual (SRMR) = 0.036. The reliability estimate was .804. The SE and the 95% Wald CI were 0.022 and (0.761, 0.847), respectively, and the 95% likelihood-based CI was (0.757, 0.844). The likelihood-based CI and the Wald CI are similar.

A MONTE CARLO SIMULATION STUDY

Purposes of the Simulation Study

Theoretically, the likelihood-based approach is preferred to the Wald approach in constructing CIs. However, the preceding illustrations seem to suggest that likelihood-based CIs and Wald CIs are comparable. Readers might be left with the impression that there is practically no difference between likelihood-based CIs and Wald CIs.

It is questionable whether the similarities and the differences between the likelihood-based CIs and the Wald CIs shown in the numerical examples are interpretable. Yung and Bentler (1996) argued that numerical examples mainly serve illustrative purposes. They suggested that simulation studies are generally preferred to numerical examples in evaluating the appropriateness of a method.

To partially address the issue that the likelihood-based CIs and the Wald CIs in the illustrations are highly similar, a simulation study was conducted on the Pearson correlation. The correlation was selected because it is one of the most common statistics in psychological research. The results might help researchers determine whether to employ likelihood-based CIs or Wald CIs. Although it is also possible to construct a CI on the Fisher's z transformation, which normalizes the sampling distribution of the correlation, the Fisher's z-transformed score was not investigated because the focus was on comparing the likelihood-based CI with the Wald CI.

Method

Bivariate normal data with known population correlations were generated with Mplus 3.12 (Muthén & Muthén, 2004). A model similar to the one shown in Figure 1a was used to analyze the correlation coefficient with the SEM approach. The 95% Wald CI and the 95% likelihood-based CI on the correlation were constructed with Mplus and Mx, respectively.

Two factors were manipulated: the population correlation and the sample size. There were four levels for the population correlation: .1, .3, .5, and .8. The levels for the sample size were 50, 100, 200, 500, and 1,000. These levels were selected so that the findings would be relevant to applied settings. Five thousand replications were generated in each condition. Sample code to conduct the simulation study is given in Appendix B.

There were two criteria in evaluating the performance of the CIs that were formed (see Casella & Berger, 2002). The first one was the coverage probability of the 95% CIs that were constructed. If the CIs were good, it was expected that the 95% CIs constructed would include the true population correlations approximately 95% of the time. The second one was the width of the CIs that were formed, which is defined as the upper limit minus the lower limit (Upper − Lower). The width indicates the precision of the parameter estimates. Given the same accuracy in the coverage probability, it is preferable to have CIs of the shortest length.
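The two criteria can be illustrated with a small self-contained simulation. The Python sketch below is not the Mplus and Mx code used in the study. It assumes the large-sample SE (1 − r²)/√n for the Wald CI. For the likelihood-based CI it assumes the closed-form LR statistic for a bivariate normal correlation with means and variances profiled out, n log[(1 − ρr)²/((1 − r²)(1 − ρ²))], which is inverted by bisection at the χ²(1) critical value. All function names, the seed, and the single condition shown are choices made for illustration.

```python
import math
import random

CHI2_1_95 = 3.841459  # 95th percentile of chi-square with 1 df
Z_975 = 1.959964      # 97.5th percentile of the standard normal


def sample_correlation(n, rho, rng):
    """Draw n bivariate normal pairs with correlation rho; return sample r."""
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1 - rho ** 2) * z2)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def wald_ci(r, n):
    """Wald CI using the assumed large-sample SE (1 - r^2) / sqrt(n)."""
    se = (1 - r ** 2) / math.sqrt(n)
    return r - Z_975 * se, r + Z_975 * se


def lr_stat(rho, r, n):
    """LR statistic for H0: correlation = rho, given sample r and size n."""
    return n * math.log((1 - rho * r) ** 2 / ((1 - r ** 2) * (1 - rho ** 2)))


def likelihood_ci(r, n, tol=1e-8):
    """Values of rho whose LR statistic stays below the chi-square cutoff."""
    def solve(inside, outside):
        # Bisection between a point inside the CI and one outside it.
        while abs(outside - inside) > tol:
            mid = (inside + outside) / 2
            if lr_stat(mid, r, n) < CHI2_1_95:
                inside = mid
            else:
                outside = mid
        return (inside + outside) / 2
    eps = 1e-12
    return solve(r, -1 + eps), solve(r, 1 - eps)


def simulate(rho, n, reps, seed=1):
    """Return coverage proportions and mean widths of both CIs."""
    rng = random.Random(seed)
    hits = {"wald": 0, "lik": 0}
    width = {"wald": 0.0, "lik": 0.0}
    for _ in range(reps):
        r = sample_correlation(n, rho, rng)
        for name, (lo, hi) in (("wald", wald_ci(r, n)),
                               ("lik", likelihood_ci(r, n))):
            hits[name] += lo <= rho <= hi   # coverage indicator
            width[name] += hi - lo          # Upper - Lower
    return ({k: v / reps for k, v in hits.items()},
            {k: v / reps for k, v in width.items()})


coverage, mean_width = simulate(rho=0.5, n=50, reps=2000)
print(coverage, mean_width)
```

The Wald interval is symmetric around r and can extend beyond ±1, whereas the likelihood-based interval is generally asymmetric and stays within the admissible range by construction.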

Results and Discussion

Figure 4 shows the coverage probabilities of the 95% CIs that were formed. There were several consistent findings. First, the performance of both the likelihood-based CI and the Wald CI improved as the sample sizes became larger. When the sample sizes were small, neither CI reached the prespecified coverage probability. In contrast, the effect of the population correlation was not very strong except when the sample sizes were small. The patterns were nearly the same for all effect sizes. The most important finding was that the likelihood-based CI had a coverage probability closer to the nominal level than the Wald CI in nearly all conditions. The differences became larger when the sample sizes were smaller.

FIGURE 4 Coverage probabilities of the 95% confidence intervals on the correlation coefficient.

Figure 5 shows the width of the 95% CIs that were formed on the correlation coefficients. The findings indicated that the widths of the likelihood-based CI and the Wald CI were nearly the same in all conditions. To summarize, the likelihood-based CI was more accurate than the Wald CI in terms of coverage probability, especially when the sample sizes were small. At the same time, the widths of the likelihood-based CIs and the Wald CIs were similar.

FIGURE 5 Width of the 95% confidence intervals on the correlation coefficient.

CONCLUSIONS AND GENERAL DISCUSSION

The goals of this article were to argue for the usefulness of the likelihood-based CI over the Wald CI and to show how SEM might be used as a general, but simple, approach to constructing different types of CIs for parameters of interest. Once the models of interest are formulated in SEM, phantom variables can be used to compute the parameters of interest. In many SEM packages, such as Mplus, LISREL, and Mx, new parameters can be easily created. Likelihood-based and Wald CIs can be constructed depending on the SEM packages used.

The results of the simulation study show that the coverage probability of the likelihood-based CI is more accurate than that of the Wald CI for a Pearson correlation. At the same time, the widths of the likelihood-based CI and the Wald CI are nearly the same. The SEM approach provides a simple method of constructing the CIs that are recommended by the APA Task Force on Statistical Inference (Wilkinson & Task Force on Statistical Inference, 1999) and the Publication Manual of the APA (APA, 2001).

Extensions and Issues for Further Research

Extensions to multiple-group analyses

Although my illustrations focus on the single-group analysis, the use of phantom variables to construct CIs is readily extended to multiple-group analyses. For example, Raykov (2005) proposed an SEM approach to test the maximal reliability estimate in two independent groups. His method can be modified with the approach presented in this article to directly obtain the differences between the reliability estimates and their CIs. Another example is to test whether the mediating effect is the same in two independent groups (Cheung, 2007), which is known as moderated mediation (James & Brett, 1984).

The ANOVA model is one of the most important models in behavioral research (see Smithson, 2003; Steiger, 2004; Steiger & Fouladi, 1997). Although the noncentrality approach proposed by Steiger (2004; Steiger & Fouladi, 1997) is general enough to account for a variety of effect sizes in the ANOVA model, there are still limitations on the noncentrality approach that apply to unbalanced designs, repeated measures, or both (see Steiger, 2004). As SEM can be used to handle between-subjects ANOVA and repeated measures with unbalanced designs (e.g., Bagozzi & Yi, 1999; Cole et al., 1993; Raykov, 2001b; Sörbom, 1974), it is natural to extend the SEM approach to constructing CIs for different ANOVA models. It would be of interest to compare the results based on the noncentrality and the SEM approaches.

Further comparison between likelihood-based CIs and Wald CIs

Numerical illustrations do not show practical differences between the likelihood-based CIs and the Wald CIs. Because likelihood-based CIs (or CIs in general) are not widely reported in psychological research, there are only a limited number of simulation studies comparing the likelihood-based CIs and the Wald CIs relevant to behavioral research (Cheung, in press; Cheung, 2007).

A related concern is the use of Wald CIs and likelihood-based CIs in small-to-moderate size samples. Both LR and Wald statistics are distributed as chi-square variates in large samples only. Several factors, such as the complexity of the models and the distribution of the data, can affect the sample sizes required to apply the asymptotic theory. Our preliminary simulation study with the Pearson correlation shows that the likelihood-based CI outperforms the Wald CI in small samples. However, further research is required to clarify how these CIs perform in other settings.

Methods of handling nonnormal data

The assumption of multivariate normality on the data is required in constructing likelihood-based CIs and Wald CIs. However, this assumption might not always be tenable in real data (Micceri, 1989). One approach to analyzing nonnormal data is to use bootstrap methods (Davison & Hinkley, 1997; Efron & Tibshirani, 1993). Bootstrap methods use the empirical distribution of the statistics to approximate the theoretical distribution of the statistics. They have been frequently applied to construct CIs in applied research, for instance, for indirect effects (Bollen & Stine, 1990; Cheung, 2007; MacKinnon et al., 2004; Shrout & Bolger, 2002) and reliability estimates (Raykov, 1998; Raykov & Shrout, 2002).

Many current SEM packages include some types of bootstrap procedures that can be used to construct bootstrap CIs (e.g., Fan, 2003). Percentile bootstrap CIs can be constructed in LISREL via PRELIS and in Mx, the latter of which provides an interface to R (R Development Core Team, 2007) to report the summary statistics. Percentile bootstrap CIs and bias-corrected (BC) bootstrap CIs are implemented in Mplus, and the model-based bootstrap (Bollen & Stine, 1992) is implemented in Mplus 4 and EQS 6.1. DiCiccio and Efron (1996) also suggested other candidates for bootstrap CIs. These include bias-corrected and accelerated (BCa) CIs and approximate bootstrap confidence (ABC) intervals. It would be of interest to investigate how these bootstrap CIs work under this SEM approach.
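To make the idea of a percentile bootstrap CI concrete, the following Python sketch resamples cases with replacement and takes empirical percentiles of the recomputed statistic. It is a generic illustration rather than the procedure of any particular SEM package. The function names, the simulated data, the seed, and the simple rule used to pick the percentile positions are all assumptions of the sketch.

```python
import math
import random


def pearson_r(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def percentile_bootstrap_ci(pairs, statistic, reps=2000, level=0.95, seed=1):
    """Percentile bootstrap CI: resample cases, sort, read off the tails."""
    rng = random.Random(seed)
    n = len(pairs)
    stats = sorted(
        statistic([pairs[rng.randrange(n)] for _ in range(n)])
        for _ in range(reps)
    )
    tail = (1 - level) / 2
    lower = stats[int(math.floor(tail * (reps - 1)))]
    upper = stats[int(math.ceil((1 - tail) * (reps - 1)))]
    return lower, upper


# Hypothetical data: 100 cases with a moderate positive association.
data_rng = random.Random(7)
data = []
for _ in range(100):
    x = data_rng.gauss(0, 1)
    data.append((x, 0.5 * x + data_rng.gauss(0, 1)))

r_hat = pearson_r(data)
boot_lower, boot_upper = percentile_bootstrap_ci(data, pearson_r)
print(round(r_hat, 3), round(boot_lower, 3), round(boot_upper, 3))
```

Any other statistic, such as an indirect effect or a reliability estimate, could be passed in place of `pearson_r` as long as it accepts the resampled cases.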

Limitations of the numerical approximation in constructing likelihood-based CIs

Although it is suggested in this article that likelihood-based CIs are better alternatives to Wald CIs, readers should be cautioned that the likelihood-based CIs were constructed via the numerical approximation of the profile likelihood functions implemented in Mx. As Neale et al. (2006) stressed, “optimization is not (yet) an exact science” (p. 88), so it is possible that the numerical approximations might fail to find the likelihood-based CIs, especially when the parameters are complicated functions with many nonlinear constraints. Finally, I agree with Steiger's (2001) view that the SEM packages might not keep up with methodological developments and the needs of researchers. With the advance of computing power, it is hoped that these CIs (the likelihood-based CIs and different bootstrap CIs) will be implemented in common SEM packages in the near future. SEM might then be a truly flexible framework for researchers seeking to construct CIs.

ACKNOWLEDGMENTS

Preparation of this work was supported by the Academic Research Fund Tier 1 (R-581-000-064-112) from the Ministry of Education, Singapore. Portions of this article were presented at the International Meeting of the Psychometric Society, July 4–9, 2005, in Tilburg, The Netherlands. I would like to thank Michael Neale and Gerhard Mels for providing advice on the use of Mx and LISREL, and Tenko Raykov for providing useful comments on the article. I would also like to express my sincere gratitude to several reviewers for their valuable suggestions on improving the article.

Notes

1The expected Fisher information, which is defined as $-E\left(d^{2}\log L(\theta)/d\theta^{2}\right)$, can also be used to estimate standard errors. However, the observed Fisher information is usually preferred (see Azzalini, 1996; Pawitan, 2001).

2It should be noted that when the parameter is tested at a boundary, say zero for variance and ±1 for correlation, the LR statistic is not distributed as χ²(1). It is distributed as a mixture of χ²(0) and χ²(1) (see Stoel, Galindo-Garre, Dolan, & van den Wittenboer, 2006). Thus, both Wald and likelihood-based CIs might be incorrect.

3Equivalently, it is also possible to set the phantom variable so that it is uncorrelated with other variables. The variance of the phantom variable can then be used to “store” the parameter of interest by imposing suitable constraints.

4Both LISREL (Jöreskog & Sörbom, 1996, pp. 345–347) and Mplus use the delta method to calculate the SEs of the parameters with constraints.

5To implement all of the examples discussed in this article, the SEM packages should be able to implement linear and nonlinear constraints on the parameters. The current versions of AMOS (6.0) and EQS (6.1) do not allow nonlinear constraints.

6The complete codes and output in Mplus, LISREL, and Mx are available at http://courses.nus.edu.sg/course/psycwlm/internet/.

7Dr. Phillip Wood posted some Mplus code to calculate the same reliability estimate at SEMNET on November 17, 2004.

REFERENCES

  • Agresti, A. 2002. Categorical data analysis, 2nd ed., Hoboken, NJ: Wiley.
  • Algina, J. 1999. A comparison of methods for constructing confidence intervals for the squared multiple correlation coefficient. Multivariate Behavioral Research, 34: 493–504.
  • American Psychological Association. 2001. Publication manual of the American Psychological Association, 5th ed., Washington, DC: Author.
  • Azzalini, A. 1996. Statistical inference: Based on the likelihood, London: Chapman & Hall.
  • Bagozzi, R. P. and Yi, Y. 1999. On the use of structural equation models in experimental designs. Journal of Marketing Research, 26: 271–285.
  • Baron, R. M. and Kenny, D. A. 1986. The moderator-mediator distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51: 1173–1182.
  • Bates, D. M. and Watts, D. G. 1988. Nonlinear regression analysis and its applications, New York: Wiley.
  • Bauer, D. J. 2003. Estimating multilevel linear models as structural equation models. Journal of Educational and Behavioral Statistics, 28: 135–167.
  • Bentler, P. M. 1995. EQS structural equations program manual, Encino, CA: Multivariate Software.
  • Bentler, P. M. and Lee, S. Y. 1983. Covariance structures under polynomial constraints: Applications to correlation and alpha-type structural models. Journal of Educational Statistics, 8: 207–222.
  • Bollen, K. A. 1987. “Total direct and indirect effects in structural equation models”. In Sociological methodology, Edited by: Clogg, C. C. 37–69. Washington, DC: American Sociological Association.
  • Bollen, K. A. 1989. Structural equations with latent variables, New York: Wiley.
  • Bollen, K. A. and Stine, R. 1990. Direct and indirect effects: Classical and bootstrap estimates of variability. Sociological Methodology, 20: 115–140.
  • Bollen, K. A. and Stine, R. A. 1992. Bootstrapping goodness-of-fit measures in structural equation models. Sociological Methods & Research, 21: 205–229.
  • Browne, M. W. and Mels, G. 1992. RAMONA, PC user's guide, Columbus, OH: Ohio State University, Department of Psychology.
  • Buse, A. 1982. The likelihood ratio, Wald, and Lagrange Multiplier tests: An expository note. The American Statistician, 36: 153–157.
  • Casella, G. and Berger, R. L. 2002. Statistical inference, 2nd ed., Pacific Grove, CA: Duxbury/Thomson Learning.
  • Cheung, M. W. L. 2007. Comparison of approaches to constructing confidence intervals for mediating effects using structural equation models. Structural Equation Modeling: A Multidisciplinary Journal, 14: 227–246.
  • Cheung, M. W. L. 2008. A model for integrating fixed-, random-, and mixed-effects meta-analyses into structural equation modeling. Psychological Methods, 13: 182–202.
  • Cheung, M. W. L. in press. Comparison of methods for constructing confidence intervals of standardized indirect effects. Behavior Research Methods.
  • Cheung, M. W. L. and Chan, W. 2004. Testing dependent correlation coefficients via structural equation modeling. Organizational Research Methods, 7: 206–223.
  • Cheung, M. W. L. and Chan, W. 2005. Meta-analytic structural equation modeling: A two-stage approach. Psychological Methods, 10: 40–64.
  • Chow, S. L. 1988. Significance test or effect size? Psychological Bulletin, 103: 105–110.
  • Cohen, J. 1994. The earth is round (p < .05). American Psychologist, 49: 997–1003.
  • Cole, D. A., Maxwell, S. E., Arvey, R. and Salas, E. 1993. Multivariate group comparisons of variable systems: MANOVA and structural equation modeling. Psychological Bulletin, 114: 174–184.
  • Cronbach, L. J. 1951. Coefficient alpha and the internal structure of tests. Psychometrika, 16: 297–334.
  • Cumming, G. and Finch, S. 2001. A primer on the understanding, use, and calculation of confidence intervals that are based on central and noncentral distributions. Educational and Psychological Measurement, 61: 532–574.
  • Curran, P. J. 2003. Have multilevel models been structural equation models all along? Multivariate Behavioral Research, 38: 529–569.
  • Curran-Everett, D. and Benos, D. J. 2004. Guidelines for reporting statistics in journals published by the American Physiological Society. Journal of Neurophysiology, 92: 669–671.
  • Davison, A. C. and Hinkley, D. V. 1997. Bootstrap methods and their application, New York: Cambridge University Press.
  • DiCiccio, T. J. and Efron, B. 1996. Bootstrap confidence intervals. Statistical Science, 11: 189–228.
  • Duhachek, A. and Iacobucci, D. 2004. Alpha's standard error (ASE): An accurate and precise confidence interval estimate. Journal of Applied Psychology, 89: 792–808.
  • Efron, B. and Tibshirani, R. J. 1993. An introduction to the bootstrap, New York: Chapman & Hall.
  • Fan, X. 2003. Using commonly available software for bootstrapping in both substantive and measurement analyses. Educational and Psychological Measurement, 63: 24–50.
  • Fan, X. and Thompson, B. 2001. Confidence intervals about score reliability coefficients, please: An EPM guidelines editorial. Educational and Psychological Measurement, 61: 517–531.
  • Gonzalez, R. and Griffin, D. 2001. Testing parameters in structural equation modeling: Every “one” matters. Psychological Methods, 6: 258–269.
  • Hahn, G. J. and Meeker, W. Q. 1991. Statistical intervals: A guide for practitioners, New York: Wiley.
  • Hardy, R. J. and Thompson, S. G. 1996. A likelihood approach to meta-analysis with random effects. Statistics in Medicine, 15: 619–629.
  • Harlow, L. L., Mulaik, S. A. and Steiger, J. H. 1997. What if there were no significance tests?, Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
  • Hittner, J. B., May, K. and Silver, N. C. 2003. A Monte Carlo evaluation of tests for comparing dependent correlations. Journal of General Psychology, 130: 149–168.
  • Horn, J. L. and McArdle, J. J. 1980. “Perspective on mathematical and statistical model building (MASMOB) in aging research”. In Aging in the 1980s: Psychological issues, Edited by: Poon, L. W. 503–541. Washington, DC: American Psychological Association.
  • Hunter, J. E. and Hamilton, M. A. 2002. The advantages of using standardized scores in causal analysis. Human Communication Research, 28: 552–561.
  • International Committee of Medical Journal Editors. 2004. Uniform requirements for manuscripts submitted to biomedical journals: Writing and editing for biomedical publication. Retrieved March 4, 2005, from http://icmje.org
  • James, L. and Brett, J. M. 1984. Mediators, moderators and tests for mediation. Journal of Applied Psychology, 69: 307–321.
  • Jöreskog, K. G. 1978. Structural analysis of covariance and correlation matrices. Psychometrika, 43: 443–477.
  • Jöreskog, K. G. and Sörbom, D. 1996. LISREL 8: A user's reference guide, Chicago: Scientific Software International.
  • Kistner, E. O. and Muller, K. E. 2004. Exact distributions of intraclass correlation and Cronbach's alpha with Gaussian data and general covariance. Psychometrika, 69: 459–474.
  • Lord, F. M. and Novick, M. 1968. Statistical theories of mental test scores, Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
  • Luebeck, E. G. 2009. Bhat: General likelihood exploration (R package version 0.9-08). Retrieved March 6, 2009, from http://www.r-project.org
  • MacKinnon, D. P. 2000. “Contrasts in multiple mediator models”. In Multivariate applications in substance use research: New methods for new questions, Edited by: Rose, J. S., Chassin, L., Presson, C. C. and Sherman, S. J. 141–160. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
  • MacKinnon, D. P., Fairchild, A. J. and Fritz, M. S. 2007. Mediation analysis. Annual Review of Psychology, 58: 593–614.
  • MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G. and Sheets, V. 2002. A comparison of methods to test the significance of the mediated effect. Psychological Methods, 7: 83–104.
  • MacKinnon, D. P., Lockwood, C. M. and Williams, J. 2004. Confidence limits for the indirect effect: Distribution of the product and resampling methods. Multivariate Behavioral Research, 39: 99–128.
  • McArdle, J. J. and Hamagami, E. 1996. “Multilevel models from a multiple group structural equation perspective”. In Advanced structural equation modeling techniques, Edited by: Marcoulides, G. and Schumacker, R. 89–124. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
  • McDonald, R. P. 1975. Testing pattern hypotheses on correlation matrices. Psychometrika, 40: 253–255.
  • Meeker, W. Q. and Escobar, L. A. 1995. Teaching about approximate confidence regions based on maximum likelihood estimation. American Statistician, 49: 48–53.
  • Mehta, P. D. and Neale, M. C. 2005. People are variable too: Multilevel structural equation modeling. Psychological Methods, 10: 259–284.
  • Micceri, T. 1989. The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105: 156166.  [Crossref], [Web of Science ®][Google Scholar]
  • Miller, M. B. 1995. Coefficient alpha: A basic introduction from the perspectives of classical test theory and structural equation modeling. Structural Equation Modeling: A Multidisciplinary Journal, 2: 255273.  [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Muthén, L. K. and Muthén, B. O. 2004. Mplus user's guide, , 3rd ed., Los Angeles: Muthén & Muthén.  [Google Scholar]
  • Neale, M. C., Boker, S. M., Xie, G. and Maes, H. H. 2006. Mx: Statistical modeling, , 7th ed., Richmond, VA: Virginia Commonwealth University, Department of Psychiatry.  [Google Scholar]
  • Neale, M. C., Heath, A. C., Hewitt, J. K., Eaves, L. J. and Fulker, D. W. 1989. Fitting genetic models with LISREL: Hypothesis testing. Behavior Genetics, 19: 3769.  [Crossref], [PubMed], [Web of Science ®][Google Scholar]
  • Neale, M. C. and Miller, M. B. 1997. The use of likelihood-based confidence intervals in genetic models. Behavior Genetics, 27: 113120.  [Crossref], [PubMed], [Web of Science ®][Google Scholar]
  • Nickerson, R. S. 2000. Null hypothesis significance testing: A review of an old and continuing controversy. Psychological Methods, 5: 241301.  [Crossref], [PubMed], [Web of Science ®][Google Scholar]
  • Olkin, I. and Finn, J. D. 1995. Correlation redux. Psychological Bulletin, 118: 155164.  [Crossref], [Web of Science ®][Google Scholar]
  • Pawitan, Y. 2001. In all likelihood: Statistical modeling and inference using likelihood, Oxford, UK: Oxford University Press.  [Google Scholar]
  • Premack, S. L. and Hunter, J. E. 1988. Individual unionization decisions. Psychological Bulletin, 103: 223234.  [Crossref], [Web of Science ®][Google Scholar]
  • R Development Core Team. 2007. R: A language and environment for statistical computing, Vienna, Austria: R Foundation for Statistical Computing.  [Google Scholar]
  • Raykov, T. 1997. Estimation of composite reliability for congeneric measures. Applied Psychological Measurement, 21: 173184.  [Crossref], [Web of Science ®][Google Scholar]
  • Raykov, T. 1998. A method for obtaining standard errors and confidence intervals of composite reliability for congeneric items. Applied Psychological Measurement, 22: 369374.  [Crossref], [Web of Science ®][Google Scholar]
  • Raykov, T. 2001a. Estimation of congeneric scale reliability using covariance structure analysis with nonlinear constraints. British Journal of Mathematical and Statistical Psychology, 54: 315323.  [Crossref], [PubMed], [Web of Science ®][Google Scholar]
  • Raykov, T. 2001b. Testing multivariate covariance structure and means hypotheses via structural equation modeling. Structural Equation Modeling: A Multidisciplinary Journal, 8: 224256.  [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Raykov, T. 2002. Analytic estimation of standard error and confidence interval for scale reliability. Multivariate Behavioral Research, 37: 89103.  [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Raykov, T. 2004. Point and interval estimation of reliability for multiple-component measuring instruments via linear constraint covariance structure modeling. Structural Equation Modeling: A Multidisciplinary Journal, 11: 342356.  [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Raykov, T. 2005. Studying group and time invariance in maximal reliability for multiple-component measuring instruments via covariance structure modeling. British Journal of Mathematical and Statistical Psychology, 58: 301317.  [Crossref], [PubMed], [Web of Science ®][Google Scholar]
  • Raykov, T. and Marcoulides, G. A. 2004. Using the delta method for approximate interval estimation of parameter functions in SEM. Structural Equation Modeling: A Multidisciplinary Journal, 11: 621637.  [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Raykov, T. and Shrout, P. E. 2002. Reliability of scales with general structure: Point and interval estimation using a structural equation modeling approach. Structural Equation Modeling: A Multidisciplinary Journal, 9: 195212.  [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Rice, J. A. 1995. Mathematical statistics and data analysis, , 2nd ed., Belmont, CA: Duxbury.  [Google Scholar]
  • Rindskopf, D. 1983. Parameterizing inequality constraints on unique variances in linear structural models. Psychometrika, 48: 7383.  [Crossref], [Web of Science ®][Google Scholar]
  • Rindskopf, D. 1984. Using phantom and imaginary latent variables to parameterize constraints in linear structural models. Psychometrika, 49: 3747.  [Crossref], [Web of Science ®][Google Scholar]
  • SAS Publishing. 2004. SAS/AF 9.1 procedure guide [electronic resource], Cary, NC: SAS Institute.
  • Schmidt, F. L. 1996. Statistical significance testing and cumulative knowledge in psychology: Implications for training of researchers. Psychological Methods, 1: 115–129.
  • Seber, G. A. F. and Wild, C. J. 1989. Nonlinear regression, New York: Wiley.
  • Shrout, P. E. and Bolger, N. 2002. Mediation in experimental and nonexperimental studies: New procedures and recommendations. Psychological Methods, 7: 422–445.
  • Smithson, M. 2003. Confidence intervals, Thousand Oaks, CA: Sage.
  • Sobel, M. E. 1982. Asymptotic confidence intervals for indirect effects in structural equation models. Sociological Methodology, 13: 290–312.
  • Sörbom, D. 1974. A general method for studying differences in factor means and factor structure between groups. British Journal of Mathematical and Statistical Psychology, 27: 229–239.
  • Steiger, J. H. 1980a. Testing pattern hypotheses on correlation matrices: Alternative statistics and some empirical results. Multivariate Behavioral Research, 15: 335–352.
  • Steiger, J. H. 1980b. Tests for comparing elements of a correlation matrix. Psychological Bulletin, 87: 245–251.
  • Steiger, J. H. 1995. “Structural equation modeling (SEPATH)”. In StatSoft Statistica/W 5.0. [computer software], Tulsa, OK: StatSoft.
  • Steiger, J. H. 2001. Driving fast in reverse: The relationship between software development, theory, and education in structural equation modeling. Journal of the American Statistical Association, 96: 331–338.
  • Steiger, J. H. 2004. Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis. Psychological Methods, 9: 164–182.
  • Steiger, J. H. 2005. “Comparing correlations: Pattern hypothesis tests between and/or within independent samples”. In Contemporary psychometrics: A festschrift for Roderick P. McDonald, Edited by: Maydeu-Olivares, A. and McArdle, J. J. 377–414. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
  • Steiger, J. H. and Fouladi, R. T. 1992. R2: A computer program for interval estimation, power calculation, and hypothesis testing for the squared multiple correlation. Behavior Research Methods, Instruments, and Computers, 4: 581–582.
  • Steiger, J. H. and Fouladi, R. T. 1997. “Noncentrality interval estimation and the evaluation of statistical models”. In What if there were no significance tests?, Edited by: Harlow, L. L., Mulaik, S. A. and Steiger, J. H. 221–257. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
  • Stoel, R. D., Galindo-Garre, F., Dolan, C. and van den Wittenboer, G. 2006. On the likelihood ratio test in structural equation modeling when parameters are subject to boundary constraints. Psychological Methods, 11: 439–455.
  • van Zyl, S. M., Heinz, N. and Nel, D. G. 2000. On the distribution of the maximum likelihood estimator of Cronbach's alpha. Psychometrika, 65: 241–280.
  • Venzon, D. J. and Moolgavkar, S. H. 1988. A method for computing profile-likelihood-based confidence intervals. Applied Statistics, 37: 87–94.
  • Viechtbauer, W. 2005. Bias and efficiency of meta-analytic variance estimators in the random-effects model. Journal of Educational and Behavioral Statistics, 30: 261–293.
  • Wainer, H. 1999. One cheer for null hypothesis significance testing. Psychological Methods, 4: 212–213.
  • Wilkinson, L. and Task Force on Statistical Inference. 1999. Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54: 594–604.
  • Willett, J. B. and Sayer, A. G. 1994. Using covariance structure analysis to detect correlates and predictors of individual change over time. Psychological Bulletin, 116: 363–381.
  • Woody, E. and Sadler, P. 2005. Structural equation models for interchangeable dyads: Being the same makes a difference. Psychological Methods, 10: 139–158.
  • World Values Study Group. 1994. World Values Survey, 1981–1984 and 1990–1993 [Computer file], Ann Arbor, MI: Inter-university Consortium for Political and Social Research.
  • Yung, Y. F. A preview of new capabilities of the CALIS procedure for structural equation modeling. Paper presented at the International Meeting of the Psychometric Society. Tilburg, The Netherlands. July.
  • Yung, Y. F. and Bentler, P. M. 1996. “Bootstrapping techniques in analysis of mean and covariance structures”. In Advanced structural equation modeling: Issues and techniques, Edited by: Marcoulides, G. A. and Schumacker, R. E. 195–226. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.

APPENDIX A

Mplus CODES FOR THE EXAMPLES

APPENDIX B

Mplus AND Mx SYNTAX FOR THE COMPUTER SIMULATION