In a previous post, the univariate and multivariate test statistics commonly used with the general linear model (GLM) were presented. Here an alternative formulation is given for one of these statistics, Pillai’s trace (Pillai, 1955; references at the end), commonly used in MANOVA and MANCOVA tests.
We begin with a multivariate general linear model expressed as:

$$Y = M\psi + \epsilon$$

where $Y$ is the $N \times K$ full-rank matrix of observed data, with $N$ observations of $K$ distinct (possibly non-independent) variables, $M$ is the full-rank $N \times R$ design matrix that includes explanatory variables (i.e., effects of interest and possibly nuisance effects), $\psi$ is the $R \times K$ matrix of regression coefficients, and $\epsilon$ is the $N \times K$ matrix of random errors. Estimates for the regression coefficients can be computed as $\hat{\psi} = M^{+}Y$, where the superscript ($^{+}$) denotes a pseudo-inverse.
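Throughout, a minimal numerical sketch in Python with NumPy may help make the notation concrete (the sizes, seed, and simulated data below are arbitrary choices, not anything prescribed by the model):

```python
import numpy as np

rng = np.random.default_rng(0)
N, R, K = 100, 4, 3                    # observations, regressors, response variables

M = rng.normal(size=(N, R))            # full-rank design matrix
psi = rng.normal(size=(R, K))          # true regression coefficients
Y = M @ psi + rng.normal(size=(N, K))  # observed data: Y = M psi + error

psi_hat = np.linalg.pinv(M) @ Y        # estimates: psi_hat = M^+ Y
```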
The null hypothesis, and a simplification
One is generally interested in testing the null hypothesis that a contrast of regression coefficients is equal to zero, i.e., $\mathcal{H}_0 : C'\psi D = 0$, where $C$ is a $R \times S$ full-rank matrix of $S$ contrasts of coefficients on the regressors encoded in $M$, $1 \leqslant S \leqslant R$, and $D$ is a $K \times Q$ full-rank matrix of $Q$ contrasts of coefficients on the dependent, response variables in $Y$, $1 \leqslant Q \leqslant K$; if $S = 1$ or $Q = 1$, the model is univariate. Once the hypothesis has been established, $Y$ can be equivalently redefined as $YD$, such that the contrast $D$ can be omitted for simplicity, and the null hypothesis stated as $\mathcal{H}_0 : C'\psi = 0$.
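Continuing the sketch, the contrasts below are arbitrary examples (testing the first two regressors and combining the first two responses), chosen only to illustrate the simplification $Y \leftarrow YD$:

```python
S, Q = 2, 2
C = np.zeros((R, S)); C[0, 0] = C[1, 1] = 1  # contrasts on the regressors
D = np.zeros((K, Q)); D[0, 0] = D[1, 1] = 1  # contrasts on the responses

Y = Y @ D                          # redefine Y as YD, so that D can be dropped
psi_hat = np.linalg.pinv(M) @ Y    # re-estimate; the null is now C' psi = 0
```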
Model partitioning
It is useful to consider a transformation of the model into a partitioned one:

$$Y = X\beta + Z\gamma + \epsilon$$

where $X$ is the matrix with regressors of interest, $Z$ is the matrix with nuisance regressors, and $\beta$ and $\gamma$ are respectively their matrices of regression coefficients. From this model we can also define the projection (hat) matrices $H_X = XX^{+}$ and $H_Z = ZZ^{+}$ due to the regressors of interest and nuisance, respectively, and the residual-forming matrices $R_X = I - H_X$ and $R_Z = I - H_Z$.
Such partitioning is not unique, and schemes can be as simple as separating the columns of $M$ as $M = [X \; Z]$, with $\psi = [\beta' \; \gamma']'$. More involved strategies can, however, be devised to obtain some practical benefits. One such partitioning is to define $X = M(M'M)^{-1}C\left(C'(M'M)^{-1}C\right)^{-1}$ and $Z = M(M'M)^{-1}C_v\left(C_v'(M'M)^{-1}C_v\right)^{-1}$, where $C_v = C_u - C\left(C'(M'M)^{-1}C\right)^{-1}C'(M'M)^{-1}C_u$, and $C_u$ has $R - S$ columns that span the null space of $C$, such that $[C \; C_u]$ is a $R \times R$ invertible, full-rank matrix (Smith et al., 2007). This partitioning has a number of features: $\hat{\beta} = C'\hat{\psi}$ and $\mathsf{Var}(\hat{\beta}) = \mathsf{Var}(C'\hat{\psi})$, i.e., estimates and variances of $\beta$ for inference on the partitioned model correspond exactly to the same inference on the original model; $X$ is orthogonal to $Z$; and $\mathsf{span}([X \; Z]) = \mathsf{span}(M)$, i.e., the partitioned model spans the same space as the original.
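In code, this partitioning and its features can be checked directly; the lines below are a sketch of the formulas just given, with the null-space basis $C_u$ obtained from a full SVD of $C$:

```python
Uc = np.linalg.svd(C)[0]          # full SVD of C; Uc is R x R
Cu = Uc[:, S:]                    # R x (R-S), spans the space orthogonal to C

MtMi = np.linalg.inv(M.T @ M)
CMC = np.linalg.inv(C.T @ MtMi @ C)
Cv = Cu - C @ CMC @ C.T @ MtMi @ Cu
X = M @ MtMi @ C @ CMC
Z = M @ MtMi @ Cv @ np.linalg.inv(Cv.T @ MtMi @ Cv)

beta_hat = np.linalg.pinv(X) @ Y             # X is orthogonal to Z, so X^+ Y suffices
assert np.allclose(beta_hat, C.T @ psi_hat)  # same estimates as the original model
assert np.allclose(X.T @ Z, 0)               # interest and nuisance are orthogonal
```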
Another partitioning scheme, derived by Ridgway (2009), defines $X = MC(C'C)^{-1}$ and $Z = M(I - CC^{+})$. As with the previous strategy, the parameters of interest in the partitioned model are equal to the contrast of the original parameters. A full column rank nuisance partition can be obtained from the singular value decomposition (SVD) of $Z$, which will also provide orthonormal columns for the nuisance partition. Orthogonality between regressors of interest and nuisance can be obtained by redefining the regressors of interest as $X \leftarrow R_ZX$.
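A sketch of this second scheme, with the SVD used both to reduce $Z$ to full column rank and to provide orthonormal columns:

```python
X = M @ np.linalg.pinv(C.T)          # equals M C (C'C)^{-1} for a full-rank C
Z = M - M @ C @ np.linalg.pinv(C)    # M (I - C C^+); possibly rank-deficient

Uz, sz, _ = np.linalg.svd(Z, full_matrices=False)
Z = Uz[:, sz > 1e-10 * sz.max()]     # orthonormal, full column rank nuisance

Rz = np.eye(N) - Z @ Z.T             # Z is orthonormal, so Hz = Z Z'
X = Rz @ X                           # orthogonalize the regressors of interest

beta_hat = np.linalg.pinv(X) @ Y
assert np.allclose(beta_hat, C.T @ psi_hat)  # contrast of the original parameters
```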
The usual multivariate statistics
For the multivariate statistics, define generically:

$$H = (C'\hat{\psi}D)'\left[C'(M'M)^{-1}C\right]^{-1}(C'\hat{\psi}D)$$

as the sums of products explained by the model (hypothesis) and:

$$E = (YD)'R_M(YD)$$

as the sums of products of the residuals, i.e., of what remains unexplained, where $R_M = I - MM^{+}$ is the residual-forming matrix of the full model. With the simplification to the original model that redefined $Y$ as $YD$, the $D$ can be dropped, so that we have $H = \hat{\psi}'C\left[C'(M'M)^{-1}C\right]^{-1}C'\hat{\psi}$ and $E = Y'R_MY$. The various well-known multivariate statistics (see this earlier blog entry) can be written as a function of $H$ and $E$. Pillai’s trace is:

$$T = \mathsf{tr}\left[H(H+E)^{-1}\right]$$
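A direct transcription of these formulas, continuing the sketch ($Y$ here is already $YD$):

```python
MtMi = np.linalg.inv(M.T @ M)
CMC = np.linalg.inv(C.T @ MtMi @ C)            # [C'(M'M)^{-1}C]^{-1}
H = (C.T @ psi_hat).T @ CMC @ (C.T @ psi_hat)  # explained sums of products
Rm = np.eye(N) - M @ np.linalg.pinv(M)         # residual-forming matrix of the full model
E = Y.T @ Rm @ Y                               # unexplained sums of products
T = np.trace(H @ np.linalg.inv(H + E))         # Pillai's trace
```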
More simplifications
With the partitioning, other simplifications are possible. Recalling that $R_Z = I - H_Z$ and that $X$ is orthogonal to $Z$ (so that $H_XH_Z = 0$), and defining $\tilde{Y} = R_ZY$, we have:

$$H = Y'H_XY = \tilde{Y}'H_X\tilde{Y}$$

The unexplained sums of products can be written in a similar manner:

$$E = Y'(I - H_X - H_Z)Y = \tilde{Y}'(I - H_X)\tilde{Y}$$

The term $H + E$ in the Pillai’s trace can therefore be rewritten as:

$$H + E = \tilde{Y}'\tilde{Y}$$

Using the property that the trace of a product is invariant to a circular permutation of the factors, Pillai’s statistic can then be written as:

$$T = \mathsf{tr}\left[\tilde{Y}'H_X\tilde{Y}\left(\tilde{Y}'\tilde{Y}\right)^{-1}\right] = \mathsf{tr}\left[H_X\tilde{Y}\left(\tilde{Y}'\tilde{Y}\right)^{-1}\tilde{Y}'\right]$$
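The same statistic computed through the partitioned model (using $X$ and $R_Z$ as left by the Ridgway sketch above) agrees with the classical computation:

```python
Ytil = Rz @ Y                        # Ytil = Rz Y
Hx = X @ np.linalg.pinv(X)           # projection due to the regressors of interest
Hyt = Ytil @ np.linalg.pinv(Ytil)    # projection onto the span of Ytil
T2 = np.trace(Hx @ Hyt)
assert np.isclose(T, T2)             # matches tr[H (H+E)^{-1}]
```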
The final, alternative form
Using singular value decomposition we have $X = U_XS_XV_X'$ and $\tilde{Y} = U_{\tilde{Y}}S_{\tilde{Y}}V_{\tilde{Y}}'$, where each $U$ contains only the columns that correspond to non-zero singular values. Thus, the above can be rewritten as:

$$T = \mathsf{tr}\left[U_XU_X'U_{\tilde{Y}}U_{\tilde{Y}}'\right]$$

The SVD transformation is useful for languages or libraries that offer a fast implementation. Otherwise, using a pseudo-inverse yields the same result, perhaps only slightly more slowly. In this case, $T = \mathsf{tr}\left[XX^{+}\tilde{Y}\tilde{Y}^{+}\right]$.
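Both routes in code; the last line is a small extra identity (by the cyclic property of the trace, the statistic equals the squared Frobenius norm of $U_X'U_{\tilde{Y}}$), an aside rather than anything required by the derivation:

```python
def orth(A, tol=1e-10):
    # Orthonormal basis for the column space of A: columns of U whose
    # singular values are non-negligible.
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, s > tol * s.max()]

Ux, Uy = orth(X), orth(Ytil)
T3 = np.trace(Ux @ Ux.T @ (Uy @ Uy.T))                       # SVD route
assert np.isclose(T3, T)
assert np.isclose(T3, np.trace(X @ np.linalg.pinv(X) @
                               Ytil @ np.linalg.pinv(Ytil)))  # pseudo-inverse route
assert np.isclose(T3, np.linalg.norm(Ux.T @ Uy, 'fro') ** 2)  # Frobenius identity
```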
Importance
If we define $A = H_X$ and $W = \tilde{Y}\left(\tilde{Y}'\tilde{Y}\right)^{-1}\tilde{Y}'$ (or $W = \tilde{Y}\tilde{Y}^{+}$), then $T = \mathsf{tr}(AW)$. The first three moments of the permutation distribution of statistics that can be written in this form can be computed analytically once $A$ and $W$ are known. With the first three moments, a gamma distribution (Pearson type III) can be fit, thus allowing p-values to be computed without resorting to the usual beta approximation to Pillai’s trace, or to permutations, yet with results that do not depend on the assumption of normality (Mardia, 1971; Kazi-Aoual et al., 1995; Minas and Montana, 2014).
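As an illustration of this form of inference, the sketch below fits a Pearson type III distribution to the first three moments of the permutation distribution of $T = \mathsf{tr}(AW)$. For simplicity, the moments here are estimated from a handful of random permutations; the point of the references above is that they can instead be obtained analytically from $A$ and $W$, with no permutations at all:

```python
from scipy.stats import pearson3, skew

A = Hx                                   # as defined above
W = Ytil @ np.linalg.pinv(Ytil)

perm_T = np.array([np.trace(A @ W[np.ix_(p, p)])           # tr(A P W P')
                   for p in (rng.permutation(N) for _ in range(500))])

# Pearson type III parameterized by mean, standard deviation and skewness:
fit = pearson3(skew=skew(perm_T), loc=perm_T.mean(), scale=perm_T.std())
p_value = fit.sf(T)                      # tail probability of the observed statistic
```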
Availability
This simplification is available in PALM, for use with imaging and non-imaging data, using Pillai’s trace itself, or a modification that allows inference on univariate statistics. As of today, this option is not yet documented, but should become openly available soon.
References
- Kazi-Aoual F, Hitier S, Sabatier R, Lebreton J-D. Refined approximations to permutation tests for multivariate inference. Comput Stat Data Anal. 1995;20(6):643–56.
- Mardia KV. The effect of nonnormality on some multivariate tests and robustness to nonnormality in the linear model. Biometrika. 1971;58(1):105–21.
- Minas C, Montana G. Distance-based analysis of variance: Approximate inference. Stat Anal Data Min. 2014;7(6):450–70.
- Pillai KCS. Some new test criteria in multivariate analysis. Ann Math Stat. 1955;26(1):117–21.
- Ridgway GR. Statistical analysis for longitudinal MR imaging of dementia. PhD Thesis. University College London, 2009.
- Smith SM, Jenkinson M, Beckmann CF, Miller K, Woolrich M. Meaningful design and contrast estimability in FMRI. Neuroimage. 2007;34(1):127–36.
- Winkler AM, Ridgway GR, Webster MA, Smith SM, Nichols TE. Permutation inference for the general linear model. Neuroimage. 2014;92:381–97.
Update: 20.Jan.2016: A slight simplification was applied to the formulas above so as to make them more elegant and remove some redundancy. The result is the same.