In a previous post, the univariate and multivariate test statistics commonly used with the general linear model (GLM) were presented. Here an alternative formulation for one of these statistics, Pillai’s trace (Pillai, 1955; references at the end), commonly used in MANOVA and MANCOVA tests, is presented.
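We begin with a multivariate general linear model expressed as:
$$\mathbf{Y} = \mathbf{M}\boldsymbol{\psi} + \boldsymbol{\epsilon}$$
where $\mathbf{Y}$ is the $N \times K$ full-rank matrix of observed data, with $N$ observations of $K$ distinct (possibly non-independent) variables, $\mathbf{M}$ is the full-rank $N \times R$ design matrix that includes explanatory variables (i.e., effects of interest and possibly nuisance effects), $\boldsymbol{\psi}$ is the $R \times K$ matrix of regression coefficients, and $\boldsymbol{\epsilon}$ is the $N \times K$ matrix of random errors. Estimates for the regression coefficients can be computed as $\hat{\boldsymbol{\psi}} = \mathbf{M}^{+}\mathbf{Y}$, where the superscript ($^{+}$) denotes a pseudo-inverse.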
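As a minimal sketch of the estimation step, the snippet below simulates a small multivariate dataset and computes the coefficient estimates with a pseudo-inverse. The sizes, the random design and the variable names (`M`, `Y`, `psi_hat`, `eps_hat`) are illustrative assumptions, not part of the original derivation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): N observations, R regressors, K responses.
N, R, K = 20, 3, 2

# Full-rank design: an intercept plus two random covariates.
M = np.column_stack([np.ones(N), rng.standard_normal((N, R - 1))])

# Simulated data from known coefficients plus a little noise.
psi_true = rng.standard_normal((R, K))
Y = M @ psi_true + 0.1 * rng.standard_normal((N, K))

# Least-squares estimates via the pseudo-inverse, and the residuals.
psi_hat = np.linalg.pinv(M) @ Y
eps_hat = Y - M @ psi_hat
```

For a full-rank design this coincides with the ordinary least-squares solution, and the residuals are orthogonal to the columns of the design.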
The null hypothesis, and a simplification
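One is generally interested in testing the null hypothesis that a contrast of regression coefficients is equal to zero, i.e., $\mathcal{H}_0 : \mathbf{C}'\boldsymbol{\psi}\mathbf{D} = \mathbf{0}$, where $\mathbf{C}$ is an $R \times S$ full-rank matrix of contrasts of coefficients on the regressors encoded in $\mathbf{M}$, $1 \leq S \leq R$, and $\mathbf{D}$ is a $K \times Q$ full-rank matrix of contrasts of coefficients on the dependent, response variables in $\mathbf{Y}$, $1 \leq Q \leq K$; if $K = 1$ or $Q = 1$, the model is univariate. Once the hypothesis has been established, $\mathbf{Y}$ can be equivalently redefined as $\mathbf{Y}\mathbf{D}$, such that the contrast $\mathbf{D}$ can be omitted for simplicity, and the null hypothesis stated as $\mathcal{H}_0 : \mathbf{C}'\boldsymbol{\psi} = \mathbf{0}$.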
Model partitioning
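It is useful to consider a transformation of the model into a partitioned one:
$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\boldsymbol{\gamma} + \boldsymbol{\epsilon}$$
where $\mathbf{X}$ is the matrix with regressors of interest, $\mathbf{Z}$ is the matrix with nuisance regressors, and $\boldsymbol{\beta}$ and $\boldsymbol{\gamma}$ are respectively the matrices of regression coefficients. From this model we can also define the projection (hat) matrices $\mathbf{H}_{\mathbf{X}} = \mathbf{X}\mathbf{X}^{+}$ and $\mathbf{H}_{\mathbf{Z}} = \mathbf{Z}\mathbf{Z}^{+}$ due to the regressors of interest and nuisance, respectively, and the residual-forming matrices $\mathbf{R}_{\mathbf{X}} = \mathbf{I} - \mathbf{H}_{\mathbf{X}}$ and $\mathbf{R}_{\mathbf{Z}} = \mathbf{I} - \mathbf{H}_{\mathbf{Z}}$.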
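Such partitioning is not unique, and schemes can be as simple as separating apart the columns of $\mathbf{M}$ as $[\mathbf{X}\;\mathbf{Z}]$, with $\boldsymbol{\psi} = [\boldsymbol{\beta}'\;\boldsymbol{\gamma}']'$. More involved strategies can, however, be devised to obtain some practical benefits. One such partitioning is to define:
$$\mathbf{X} = \mathbf{M}\mathbf{G}\mathbf{C}\left(\mathbf{C}'\mathbf{G}\mathbf{C}\right)^{-1} \quad \text{and} \quad \mathbf{Z} = \mathbf{M}\mathbf{G}\mathbf{C}_v\left(\mathbf{C}_v'\mathbf{G}\mathbf{C}_v\right)^{-1}$$
where $\mathbf{G} = (\mathbf{M}'\mathbf{M})^{-1}$, $\mathbf{C}_v = \mathbf{C}_u - \mathbf{C}(\mathbf{C}'\mathbf{G}\mathbf{C})^{-1}\mathbf{C}'\mathbf{G}\mathbf{C}_u$, and $\mathbf{C}_u$ has $R - \mathrm{rank}(\mathbf{C})$ columns that span the null space of $\mathbf{C}$, such that $[\mathbf{C}\;\mathbf{C}_u]$ is an $R \times R$ invertible, full-rank matrix (Smith et al., 2007). This partitioning has a number of features: $\hat{\boldsymbol{\beta}} = \mathbf{C}'\hat{\boldsymbol{\psi}}$, $\widehat{\mathrm{Cov}}(\hat{\boldsymbol{\beta}}) = \mathbf{C}'\,\widehat{\mathrm{Cov}}(\hat{\boldsymbol{\psi}})\,\mathbf{C}$, i.e., estimates and variances of $\boldsymbol{\beta}$ for inference on the partitioned model correspond exactly to the same inference on the original model, $\mathbf{X}$ is orthogonal to $\mathbf{Z}$, and $\mathrm{span}([\mathbf{X}\;\mathbf{Z}]) = \mathrm{span}(\mathbf{M})$, i.e., the partitioned model spans the same space as the original.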
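The partitioning of Smith et al. (2007) can be sketched as below. This is only an illustration under stated assumptions: the function name `partition_smith` is hypothetical, the design and contrast are assumed to have full column rank, the contrast is assumed to have fewer columns than the design, and the basis for the null space is taken from an SVD (any other basis would do).

```python
import numpy as np

def partition_smith(M, C):
    """Split a design M (N x R) into interest X and nuisance Z for contrast C (R x S).

    Assumes M and C have full column rank and S < R (hypothetical helper).
    """
    G = np.linalg.inv(M.T @ M)
    S = np.linalg.matrix_rank(C)
    # The left singular vectors beyond rank(C) are orthogonal to the
    # columns of C, so [C Cu] is square and invertible.
    U, _, _ = np.linalg.svd(C, full_matrices=True)
    Cu = U[:, S:]
    CGC_inv = np.linalg.inv(C.T @ G @ C)
    Cv = Cu - C @ CGC_inv @ C.T @ G @ Cu
    X = M @ G @ C @ CGC_inv
    Z = M @ G @ Cv @ np.linalg.inv(Cv.T @ G @ Cv)
    return X, Z

# Illustrative data: intercept plus three covariates, two response variables.
rng = np.random.default_rng(1)
N, R, K = 25, 4, 2
M = np.column_stack([np.ones(N), rng.standard_normal((N, R - 1))])
Y = rng.standard_normal((N, K))
# Contrast selecting the second and third regressors.
C = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])

X, Z = partition_smith(M, C)
psi_hat = np.linalg.pinv(M) @ Y
beta_hat = np.linalg.pinv(X) @ Y  # X is orthogonal to Z, so Z can be ignored here
```

With these inputs, `X.T @ Z` is numerically zero and `beta_hat` matches `C.T @ psi_hat`, which are the features listed above.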
Another partitioning scheme, derived by Ridgway (2009), defines $\mathbf{X} = \mathbf{M}(\mathbf{C}^{+})'$ and $\mathbf{Z} = \mathbf{M} - \mathbf{M}\mathbf{C}\mathbf{C}^{+}$. As with the previous strategy, the parameters of interest in the partitioned model are equal to the contrast of the original parameters. A full column rank nuisance partition can be obtained from the singular value decomposition (SVD) of $\mathbf{Z}$, which will also provide orthonormal columns for the nuisance partition. Orthogonality between regressors of interest and nuisance can be obtained by redefining the regressors of interest as $\mathbf{R}_{\mathbf{Z}}\mathbf{X}$.
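A sketch of the Ridgway (2009) scheme follows, including the SVD step for a full column rank nuisance partition and the orthogonalisation of the regressors of interest. The function name `partition_ridgway` and the relative tolerance used to decide which singular values count as non-zero are assumptions made for illustration.

```python
import numpy as np

def partition_ridgway(M, C, rtol=1e-10):
    """Interest/nuisance partition of M (N x R) for contrast C (R x S).

    Assumes M and C have full column rank (hypothetical helper).
    """
    C_pinv = np.linalg.pinv(C)
    X = M @ C_pinv.T
    Z0 = M - M @ C @ C_pinv  # rank-deficient nuisance partition
    # Keep the left singular vectors with non-negligible singular values:
    # they form an orthonormal basis for the nuisance space.
    U, s, _ = np.linalg.svd(Z0, full_matrices=False)
    Z = U[:, s > rtol * s.max()]
    # Orthogonalise the regressors of interest: X <- R_Z X, using Z'Z = I.
    X = X - Z @ (Z.T @ X)
    return X, Z

# Illustrative data: intercept plus three covariates, two response variables.
rng = np.random.default_rng(2)
N, R, K = 25, 4, 2
M = np.column_stack([np.ones(N), rng.standard_normal((N, R - 1))])
Y = rng.standard_normal((N, K))
C = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])

X, Z = partition_ridgway(M, C)
psi_hat = np.linalg.pinv(M) @ Y
beta_hat = np.linalg.pinv(X) @ Y  # valid because X is orthogonal to Z
```

As with the previous scheme, `beta_hat` agrees with `C.T @ psi_hat` up to numerical precision.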
The usual multivariate statistics
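For the multivariate statistics, define generically:
$$\mathbf{H} = (\mathbf{C}'\hat{\boldsymbol{\psi}}\mathbf{D})'\left(\mathbf{C}'(\mathbf{M}'\mathbf{M})^{-1}\mathbf{C}\right)^{-1}(\mathbf{C}'\hat{\boldsymbol{\psi}}\mathbf{D})$$
as the sums of products explained by the model (hypothesis) and:
$$\mathbf{E} = (\hat{\boldsymbol{\epsilon}}\mathbf{D})'(\hat{\boldsymbol{\epsilon}}\mathbf{D})$$
as the sums of the products of the residuals, i.e., that remain unexplained. With the simplification to the original model that redefined $\mathbf{Y}$ as $\mathbf{Y}\mathbf{D}$, the $\mathbf{D}$ can be dropped, so that we have $\mathbf{H} = (\mathbf{C}'\hat{\boldsymbol{\psi}})'\left(\mathbf{C}'(\mathbf{M}'\mathbf{M})^{-1}\mathbf{C}\right)^{-1}(\mathbf{C}'\hat{\boldsymbol{\psi}})$ and $\mathbf{E} = \hat{\boldsymbol{\epsilon}}'\hat{\boldsymbol{\epsilon}}$. The various well-known multivariate statistics (see this earlier blog entry) can be written as a function of $\mathbf{H}$ and $\mathbf{E}$. Pillai’s trace is:
$$T = \mathrm{trace}\left(\mathbf{H}(\mathbf{H} + \mathbf{E})^{-1}\right)$$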
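The usual computation of Pillai's trace from the hypothesis and residual sums of products can be sketched as follows. The function name `pillai_trace` and the simulated data are illustrative assumptions; the response contrast is taken as already absorbed into the data.

```python
import numpy as np

def pillai_trace(Y, M, C):
    """Pillai's trace from the hypothesis (H) and residual (E) sums of products.

    Y is N x K, M is a full-rank N x R design, C is an R x S contrast.
    Hypothetical helper written for illustration only.
    """
    psi_hat = np.linalg.pinv(M) @ Y
    eps_hat = Y - M @ psi_hat
    Cpsi = C.T @ psi_hat
    H = Cpsi.T @ np.linalg.inv(C.T @ np.linalg.inv(M.T @ M) @ C) @ Cpsi
    E = eps_hat.T @ eps_hat
    return np.trace(H @ np.linalg.inv(H + E))

# Illustrative data: two regressors of interest, intercept as nuisance.
rng = np.random.default_rng(3)
N, K = 30, 3
M = np.column_stack([rng.standard_normal((N, 2)), np.ones(N)])
C = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
Y = rng.standard_normal((N, K))

T = pillai_trace(Y, M, C)
```

The statistic is a sum of at most min(S, K) eigenvalues that each lie between 0 and 1, and it does not change if the data are multiplied by a non-zero constant.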
More simplifications
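With the partitioning, other simplifications are possible. Because $\mathbf{X}$ is orthogonal to $\mathbf{Z}$ and the contrast in the partitioned model simply selects $\boldsymbol{\beta}$, the term $\mathbf{C}'(\mathbf{M}'\mathbf{M})^{-1}\mathbf{C}$ reduces to $(\mathbf{X}'\mathbf{X})^{-1}$, so that:
$$\mathbf{H} = \hat{\boldsymbol{\beta}}'(\mathbf{X}'\mathbf{X})\hat{\boldsymbol{\beta}} = (\mathbf{X}\hat{\boldsymbol{\beta}})'(\mathbf{X}\hat{\boldsymbol{\beta}})$$
Recalling that, with $\mathbf{X}$ orthogonal to $\mathbf{Z}$, $\hat{\boldsymbol{\beta}} = \mathbf{X}^{+}\mathbf{R}_{\mathbf{Z}}\mathbf{Y}$, and defining $\tilde{\mathbf{Y}} = \mathbf{R}_{\mathbf{Z}}\mathbf{Y}$, we have:
$$\mathbf{H} = (\mathbf{X}\mathbf{X}^{+}\tilde{\mathbf{Y}})'(\mathbf{X}\mathbf{X}^{+}\tilde{\mathbf{Y}}) = \tilde{\mathbf{Y}}'\mathbf{H}_{\mathbf{X}}\tilde{\mathbf{Y}}$$
The unexplained sums of products can be written in a similar manner:
$$\mathbf{E} = (\mathbf{R}_{\mathbf{X}}\tilde{\mathbf{Y}})'(\mathbf{R}_{\mathbf{X}}\tilde{\mathbf{Y}}) = \tilde{\mathbf{Y}}'\mathbf{R}_{\mathbf{X}}\tilde{\mathbf{Y}}$$
The term $(\mathbf{H} + \mathbf{E})$ in the Pillai’s trace can therefore be rewritten as:
$$\mathbf{H} + \mathbf{E} = \tilde{\mathbf{Y}}'(\mathbf{H}_{\mathbf{X}} + \mathbf{R}_{\mathbf{X}})\tilde{\mathbf{Y}} = \tilde{\mathbf{Y}}'\tilde{\mathbf{Y}}$$
Using the property that the trace of a product is invariant to a circular permutation of the factors, Pillai’s statistic can then be written as:
$$T = \mathrm{trace}\left(\tilde{\mathbf{Y}}'\mathbf{H}_{\mathbf{X}}\tilde{\mathbf{Y}}(\tilde{\mathbf{Y}}'\tilde{\mathbf{Y}})^{-1}\right) = \mathrm{trace}\left(\mathbf{H}_{\mathbf{X}}\tilde{\mathbf{Y}}(\tilde{\mathbf{Y}}'\tilde{\mathbf{Y}})^{-1}\tilde{\mathbf{Y}}'\right) = \mathrm{trace}\left(\mathbf{H}_{\mathbf{X}}\tilde{\mathbf{Y}}\tilde{\mathbf{Y}}^{+}\right)$$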
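A quick numerical check of this simplification is sketched below: Pillai's trace is computed once from the hypothesis and residual sums of products, and once as the trace of the hat matrix of the regressors of interest times the projection onto the residualised data. The design is a simple column split into interest and nuisance, with the regressors of interest orthogonalised with respect to the nuisance; all names and sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
N, K = 30, 3

# Simple partition (assumption): two regressors of interest, two nuisance.
X0 = rng.standard_normal((N, 2))
Z = np.column_stack([np.ones(N), rng.standard_normal(N)])
M = np.column_stack([X0, Z])
C = np.vstack([np.eye(2), np.zeros((2, 2))])  # contrast selecting X0
Y = 0.5 * X0 @ rng.standard_normal((2, K)) + rng.standard_normal((N, K))

# Usual form: T = trace(H (H + E)^-1).
psi_hat = np.linalg.pinv(M) @ Y
eps_hat = Y - M @ psi_hat
Cpsi = C.T @ psi_hat
H = Cpsi.T @ np.linalg.inv(C.T @ np.linalg.inv(M.T @ M) @ C) @ Cpsi
E = eps_hat.T @ eps_hat
T_usual = np.trace(H @ np.linalg.inv(H + E))

# Alternative form: residualise with respect to the nuisance, then
# T = trace(H_X Ytilde Ytilde^+).
Rz = np.eye(N) - Z @ np.linalg.pinv(Z)
X = Rz @ X0            # regressors of interest, orthogonal to Z
Yt = Rz @ Y            # residualised data
Hx = X @ np.linalg.pinv(X)
T_alt = np.trace(Hx @ Yt @ np.linalg.pinv(Yt))
```

The two values agree to numerical precision, and the intermediate identity that the sum of the two sums of products equals the cross-product of the residualised data can be checked in the same way.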
The final, alternative form
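Using singular value decomposition we have $\tilde{\mathbf{Y}} = \mathbf{U}\mathbf{S}\mathbf{V}'$ and $\tilde{\mathbf{Y}}^{+} = \mathbf{V}\mathbf{S}^{+}\mathbf{U}'$, where $\mathbf{U}$ contains only the columns that correspond to non-zero singular values. Thus, the above can be rewritten as:
$$T = \mathrm{trace}\left(\mathbf{H}_{\mathbf{X}}\mathbf{U}\mathbf{S}\mathbf{V}'\mathbf{V}\mathbf{S}^{+}\mathbf{U}'\right) = \mathrm{trace}\left(\mathbf{H}_{\mathbf{X}}\mathbf{U}\mathbf{U}'\right)$$
The SVD transformation is useful for languages or libraries that offer a fast implementation. Otherwise, using a pseudo-inverse yields the same result, perhaps only slightly slower. In this case, $T = \mathrm{trace}\left(\mathbf{H}_{\mathbf{X}}\tilde{\mathbf{Y}}\tilde{\mathbf{Y}}^{+}\right)$.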
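The two routes can be compared side by side, as in the sketch below. The function names `pillai_pinv` and `pillai_svd`, the tolerance for discarding singular values, and the simulated inputs are all assumptions; the inputs are taken to be already residualised with respect to the nuisance regressors.

```python
import numpy as np

def pillai_pinv(X, Yt):
    """T = trace(H_X Yt Yt^+), using pseudo-inverses (hypothetical helper)."""
    Hx = X @ np.linalg.pinv(X)
    return np.trace(Hx @ Yt @ np.linalg.pinv(Yt))

def pillai_svd(X, Yt, rtol=1e-10):
    """T = trace(H_X U U'), with U from the SVD of Yt (hypothetical helper)."""
    Hx = X @ np.linalg.pinv(X)
    U, s, _ = np.linalg.svd(Yt, full_matrices=False)
    U = U[:, s > rtol * s.max()]  # keep columns with non-zero singular values
    # trace(H_X U U') equals trace(U' H_X U), which avoids forming U U'.
    return np.trace(U.T @ Hx @ U)

# Illustrative inputs: residualise data and regressors against the nuisance.
rng = np.random.default_rng(7)
N, K = 40, 4
Z = np.column_stack([np.ones(N), rng.standard_normal(N)])
Rz = np.eye(N) - Z @ np.linalg.pinv(Z)
X = Rz @ rng.standard_normal((N, 2))
Yt = Rz @ rng.standard_normal((N, K))

T_pinv = pillai_pinv(X, Yt)
T_svd = pillai_svd(X, Yt)
```

Which of the two is faster depends on the library and on the sizes involved, so it is worth timing both for the data at hand.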
Importance
If we define $\mathbf{A} = \mathbf{H}_{\mathbf{X}}$ and $\mathbf{W} = \mathbf{U}\mathbf{U}'$ (or $\mathbf{W} = \tilde{\mathbf{Y}}\tilde{\mathbf{Y}}^{+}$), then $T = \mathrm{trace}(\mathbf{A}\mathbf{W})$. The first three moments of the permutation distribution of statistics that can be written in this form can be computed analytically once $\mathbf{A}$ and $\mathbf{W}$ are known. With the first three moments, a gamma distribution (Pearson type III) can be fit, thus allowing p-values to be computed without resorting to the usual beta approximation to Pillai’s trace, nor using permutations, yet with results that are not based on the assumption of normality (Mardia, 1971; Kazi-Aoual, 1995; Minas and Montana, 2014).
Availability
This simplification is available in PALM, for use with imaging and non-imaging data, using Pillai’s trace itself, or a modification that allows inference on univariate statistics. As of today, this option is not yet documented, but should become openly available soon.
References
- Kazi-Aoual F, Hitier S, Sabatier R, Lebreton J-D. Refined approximations to permutation tests for multivariate inference. Comput Stat Data Anal. 1995;20(6):643–56.
- Mardia KV. The effect of nonnormality on some multivariate tests and robustness to nonnormality in the linear model. Biometrika. 1971;58(1):105–21.
- Minas C, Montana G. Distance-based analysis of variance: Approximate inference. Stat Anal Data Min. 2014;7(6):450–70.
- Pillai KCS. Some new test criteria in multivariate analysis. Ann Math Stat. 1955;26(1):117–21.
- Ridgway GR. Statistical analysis for longitudinal MR imaging of dementia. PhD Thesis. University College London, 2009.
- Smith SM, Jenkinson M, Beckmann CF, Miller K, Woolrich M. Meaningful design and contrast estimability in FMRI. Neuroimage. 2007;34(1):127–36.
- Winkler AM, Ridgway GR, Webster MA, Smith SM, Nichols TE. Permutation inference for the general linear model. Neuroimage. 2014;92:381–97.
Update: 20.Jan.2016: A slight simplification was applied to the formulas above so as to make them more elegant and remove some redundancy. The result is the same.