While the CCA and SVD models were based on maximising correlations and covariance, the regression models aim to minimise the root-mean-square (RMS) error. The regression models are based on the least-squares solution to an inconsistent system:
\[ Y = X \Psi + \eta \]
of $n$ equations with $q$ unknowns, where $Y$ holds the predictands, $X$ the predictors, $\Psi$ the unknown regression coefficients, and $\eta$ represents a noise term. If the noise term is insignificant, then the least-squares solution satisfies the normal equations:
\[ X^T X \Psi = X^T Y \]
The matrix product $X^T X$ is invertible if the columns of $X$ are independent, and we can express $\Psi$ in terms of $X$ and $Y$ only [, pp.156]:
\[ \Psi = (X^T X)^{-1} X^T Y \]
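The normal equations can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the synthetic data, the variable names (`X` for the predictors, `y` for a single predictand, `psi` for the coefficients) and the noise level are not taken from the text.

```python
import numpy as np

# Synthetic, purely illustrative data: n equations, q unknowns.
rng = np.random.default_rng(0)
n, q = 200, 3
X = rng.normal(size=(n, q))                  # predictors, independent columns
psi_true = np.array([1.5, -0.7, 0.3])        # hypothetical "true" coefficients
y = X @ psi_true + 0.1 * rng.normal(size=n)  # predictand = signal + weak noise

# Solve the normal equations  X^T X psi = X^T y  without forming the inverse.
psi_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's least-squares solver.
psi_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# RMS error of the fit, the quantity the regression minimises.
rms = np.sqrt(np.mean((y - X @ psi_normal) ** 2))
print(psi_normal, rms)
```

Solving the linear system with `np.linalg.solve` rather than inverting $X^T X$ explicitly is a numerical design choice of this sketch; both give the same coefficients when the columns of the predictor matrix are independent.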
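Equation 8.36 (the inconsistent system) may, however, involve significant noise levels, and a ``true'' estimate of the regression coefficients $\Psi$ is then:
\[ \Psi = (X^T V^{-1} X)^{-1} X^T V^{-1} Y \]
where $V$ is the error-covariance matrix. One problem is that we only have an estimate of $V$ if $\Psi$ is known (the errors are the residuals $\eta = Y - X \Psi$). The matrix product $X^T X$ may also be non-invertible. We can get around these problems by excluding the noise term from the analysis and only attempting to predict the signal in the predictands $Y$ that is related to the predictors $X$, which we refer to as $\hat{Y}$.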
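The role of the error-covariance matrix can be sketched as follows, under the strong and usually unrealistic assumption that it is known exactly. The names `X`, `y`, `psi`, `sigma` and `V_inv`, the diagonal form of the covariance and the synthetic data are all hypothetical choices for this example.

```python
import numpy as np

# Synthetic data whose noise standard deviation grows with the sample index.
rng = np.random.default_rng(1)
n, q = 300, 2
X = rng.normal(size=(n, q))
psi_true = np.array([2.0, -1.0])         # hypothetical "true" coefficients
sigma = np.linspace(0.1, 2.0, n)         # assumed known noise std per sample
y = X @ psi_true + sigma * rng.normal(size=n)

# Assumed error covariance V = diag(sigma^2); only its inverse is needed.
V_inv = np.diag(1.0 / sigma**2)

# Weighted estimate: psi = (X^T V^-1 X)^-1 X^T V^-1 y
psi_weighted = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)

# Unweighted estimate from the ordinary normal equations, for comparison.
psi_plain = np.linalg.solve(X.T @ X, X.T @ y)
print(psi_weighted, psi_plain)
```

In practice the covariance has to be estimated from residuals that themselves depend on the coefficients, which is the circularity noted above.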
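By applying PCA to the data and truncating after the $k$th leading EOF, we also remove noise in $X$ and ensure that $X^T X$ is invertible (within the space spanned by the retained EOFs) by writing the matrix in terms of its PCA products,
\[ X = U \Lambda E^T, \qquad X^T X = E \Lambda^2 E^T \]
where the columns of $E$ are the $k$ retained EOFs, $U$ holds the corresponding principal components ($U^T U = I$), and $\Lambda$ is the diagonal matrix of the retained, non-zero singular values. Hence, with the noise term excluded, equation 8.39 can be expressed as:
\[ \Psi = E \Lambda^{-1} U^T Y, \qquad \hat{Y} = X \Psi = U U^T Y \]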
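A minimal sketch of regression on the $k$ leading PCA modes is given below, using the SVD of the predictor matrix as the PCA. The notation (`U_k` for principal components, `s_k` for singular values, `E_k` for EOFs), the synthetic rank-deficient data and the choice `k = 3` are assumptions of this example rather than the author's implementation.

```python
import numpy as np

# Synthetic predictors with only k independent patterns among q columns,
# so that X^T X is singular and cannot be inverted directly.
rng = np.random.default_rng(2)
n, q, k = 150, 6, 3
Z = rng.normal(size=(n, k))                  # hidden k-dimensional signal
X = Z @ rng.normal(size=(k, q))              # rank-k predictor matrix
y = Z @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=n)

# PCA via SVD: X = U diag(s) E^T; keep the k leading modes.
U, s, Et = np.linalg.svd(X, full_matrices=False)
U_k, s_k, E_k = U[:, :k], s[:k], Et[:k].T

# Coefficients in terms of the PCA products: psi = E_k diag(1/s_k) U_k^T y
psi_k = E_k @ ((U_k.T @ y) / s_k)

# Predicted signal and RMS error of the fit.
y_hat = X @ psi_k
rms = np.sqrt(np.mean((y - y_hat) ** 2))
print(np.linalg.matrix_rank(X.T @ X), rms)
```

With real data the trailing singular values are small rather than exactly zero, and the choice of `k` decides how much of the variance is treated as noise.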
MVR results are in general similar to the BP-CCA results. See Chapter 7 for more discussion on linear and multi-variate regression techniques.