alexa Statistical Analysis of Large Cross-Covariance and Cross-Correlation Matrices Produced by fMRI Images | OMICS International
ISSN: 2155-6180
Journal of Biometrics & Biostatistics

Like us on:

Make the best use of Scientific Research and information from our 700+ peer reviewed, Open Access Journals that operates with the help of 50,000+ Editorial Board Members and esteemed reviewers and 1000+ Scientific associations in Medical, Clinical, Pharmaceutical, Engineering, Technology and Management Fields.
Meet Inspiring Speakers and Experts at our 3000+ Global Conferenceseries Events with over 600+ Conferences, 1200+ Symposiums and 1200+ Workshops on
Medical, Pharma, Engineering, Science, Technology and Business

Statistical Analysis of Large Cross-Covariance and Cross-Correlation Matrices Produced by fMRI Images

Sam Efromovich* and Ekaterina Smirnova

Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, Texas, 75083, USA

*Corresponding Author:
Sam Efromovich
Department of Mathematical Sciences
The University of Texas at Dallas, Richardson, TX 75080, USA
Tel: 972-883-2161
E-mail: [email protected]

Received date: February 28 2014; Accepted date: March 25, 2014; Published date: March 31, 2014

Citation: Efromovich S, Smirnova E (2014) Statistical Analysis of Large Cross- Covariance and Cross-Correlation Matrices Produced by fMRI Images. J Biomet Biostat 5:193. doi:10.4172/2155-6180.1000193

Copyright: © 2014 Efromovich S, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are are credited.

Visit for more related articles at Journal of Biometrics & Biostatistics


The paper describes the theory, methods and application of statistical analysis of large-p-small-n cross-correlation matrices arising in fMRI studies of neuroplasticity, which is the ability of the brain to recognize neural pathways based on new experience and change in learning. Traditionally these studies are based on averaging images over large areas in right and left hemispheres and then finding a single cross-correlation function. It is proposed to conduct such an analysis based on a voxel-to-voxel level which immediately yields large cross-correlation matrices. Furthermore, the matrices have an interesting property to have both sparse and dense rows and columns. Main steps in solving the problem are: (i) treat observations, available for a single voxel, as a nonparametric regression; (ii) use a wavelet transform and then work with empirical wavelet coefficients; (iii) develop the theory and methods of adaptive simultaneous confidence intervals and adaptive rate-minimax thresholding estimation for the matrices. The developed methods are illustrated via analysis of fMRI experiments and the results allow us not only conclude that during fMRI experiments there is a change in cross-correlation between left and right hemispheres (the fact well known in the literature), but that we can also enrich our understanding how neural pathways are activated and then remain activated in timeon a single voxel-to-voxel level.


Confidence interval; Large-p-small-n; Minimaxity; Motor Cortex; Voxels; Wavelet


Neuroplasticity is the ability of the brain to reorganize neural pathways based on new experiences and learning. Functional MRI is widely used to investigate brain networks and develop possible treatment for patients with stroke, Alzheimer disease, traumatic brain injury, multiple sclerosis, and schizophrenia [1-3].

Neuroplasticity experiments typically involve studies of the brain's activity (both on human and animal subjects) following a specific task. The aim of many experiments is to determine changes in brain's connectivity, which are typically measured via increase in inter hemispheric correlation. This paper is motivated by the analysis of experiments, conducted by the Southwestern Medical school, on motor cortex resting state neuroplasticity experiment on healthy human subjects. These experiments study changes in resting state motor cortex network following the button clicking training. The increased crosscorrelation between the right and left motor cortices in post-training resting state, compared to pre-training resting state, is an indicator of training efficiency.

Let us briey explain two popular approaches in statistical analysis of the fMRI data. The former is to average voxels' signals over all ROI (region of interest in the experiment) for each hemisphere and then work with just two signals. Following this method, for each subject the data would consist of the right and left motor cortex hemisphere time series during pre-training and post-training resting state. Then the cross-correlation is computed using Pearson cross-correlation for both pre-training and post-training averaged signals. The reason why averaging over ROI is popular is because each fMRI experiment involves working with large datasets containing noisy observations. In particular, the studied experimental data involves 11 slices of fMRI scanner data where motor cortex is clearly visible, in addition in each slice there are approximately 80-130 voxels in both the right and left motor cortex. Note that the number of left motor cortex voxels does not typically equal to the number of right motor cortex voxels. Averaging over the region of interest smoothes observed time series signals and simplifies statistical inference, at the same time important information. about voxels' connectivity dynamics is lost due to averaging.

The latter popular approach is based on a voxel-to-voxel analysis via computing Pearson correlation between every pair of symmetric inter-hemispheric voxels' time-series [4,5]. The correlations reect voxel-mirrored homotopic connectivity (VMHC). VMHC assumes a symmetric morphology between hemispheres, and because this assumption does not hold for human brains, images must be transformed before VMHC can be calculated. Then global and regional group differences in VMHC are examined. Global VMHC is calculated by averaging VMHC values across all brain voxels within a unilateral hemispheric grey matter mask (there is only one correlation for each pair of homotopic voxels). Group comparisons of global VMHC are performed using t-tests. Again, due to multiple comparisons problem the voxel-based correlations are averaged. Using t-test for regional group differences in VMHC, individual-level VHMC maps are entered into a group-level voxel-wise t-test analysis using a mixed-effects ordinary least squares model. Multiple comparisons corrections are performed using Gaussian random field theory [6]. Note that this methodology assumes that observed time series should be sufficiently smooth.

This paper proposes, for neuroplasticity experiments, to measure connectivity between left and right hemispheres via cross-correlation between all pairs of voxels in right and left hemisphere regions of interest. This approach yields estimation of large cross-correlation matrices with over 1000 elements. Further, following a standard data pre-processing methodology, a bandpass filtering at 0.01-0.1 Hz is used to identify hemodynamic response [7]. Furthermore, because an observed time series signal for an individual voxel is contaminated by a relatively large noise, new methods of statistical analysis and inference are required.

We propose a new statistical methodology based on voxel-tovoxel analysis of fMRI data via wavelet decomposition approach. In particular, we suggest treating an observed voxel’s signal as a nonparametric regression and then using a wavelet transform to express crosscorrelation via empirical wavelet coefficients. This immediately creates a problem of estimation of large matrices based on relatively small sample sizes. Interesting specific of the problem at hand is that rows (and columns) of a cross-correlation matrix may be sparse and dense. To solve this problem, it is proposed to combine two methodologies developed in this paper: adaptive and rate-minimax estimation of sparse rows and adaptive simultaneous confidence intervals for elements of large cross-correlation matrices. Both the methodology and asymptotic results are presented and then illustrated on the practical fMRI example.

Using a wavelet approximation for signal processing is well known [8-10]. It is a reliable method for filtering a signal from noise. There is also a literature devoted to wavelet analysis of stationary time series and estimation cross-spectrums. In [11] a consistent crosscorrelation estimate is proposed, in [12] estimation of a bivariate time series correlation is done via spectral-domain approach. To the best of our knowledge, there is no literature on wavelet-based estimation of large cross-correlation matrices with both sparse and dense rows, where the underlying model is a nonparametric regression. On the other hand, there is large and growing literature on rate-minimax estimation of traditional correlation matrix with sparse rows; see reviews and discussion in [13,14].

The context of the paper is as follows. Section Methods and Theory introduces notation, models, methodology of estimation and asymptotic results. Section fMRI Application describes the applied problem. Section Results presents and discusses main applied results obtained with the help of the developed methodology. Proofs can be found in the Appendix.

Methods and Theory

In signal processing, cross-correlation is the measure of similarity between two functions. This measure is of interest in various applications including pattern recognition, single particle analysis, electron tomographic imaging, cryptanalysis and neurophysiology. For continuous functions, f and g, the cross-correlation is defined as:


Suppose that two time series signals, say equation and equation , are observed at n equally spaced time points. Without any loss of generality we can assume that the signals are observed for t ∈ [0, 1] and hence we observe equation and equation , l = 1,2,…,n. Let us additionally assume that n is dyadic (otherwise we choose the largest dyadic number not larger than n), in our fMRI example n=256=28. Then it is proposed to approximate equation and equation via corresponding wavelet multiresolution expansions,


Here ϕ(t) and ψ(t) are the scaling (father) function and mother wavelet which generate a wavelet basis on [0,1], equation, equation and the wavelet coefficients equation are calculated using observations of the signals at n equidistant points via a cascade algorithm supported by all wavelet software packages [8,9]. In our fMRI example we use j0 = 5.

Then it is suggested to use a band-pass filter to consider only scales in the wavelet expansion that are of interest in an underlying practical example. For instance, in the fMRI example the rational, for choosing a particular set of scales, is based on the frequency domain of hemodynamic response. In frequency domain hemodynamic response corresponds to the range of 0.01-0.1 Hz, and for our particular sample size, this corresponds to the set of scales 3, 4 and 5 [7]. As a result, from now on we are working not with signals (1) but with

equation                              (2)

where equation Correspondingly, available observations for the right and left hemispheres are equation and equation , l = 1,2,…,n. Examples of original observed signals equation and filtered signals equation can be found in Figure 1. As we see, the filtering does help us to visualize hemodynamic responses and remove both the trend and high frequency noise. Furthermore, the underlying deterministic signals of interest are

equation                              (3)


Figure 1:Original observed and filtered signals. A left hemisphere voxel's signal is shown by the solid line and a right hemisphere voxel's signal is shown by the dashed line. Top diagrams exhibit observed and filtered signals of two highly activated voxels. Bottom diagrams exhibit observed and filtered signals of two voxels with small cross-correlation.

Where equation, equation are the corresponding wavelet coefficients. It is assumed that equation and we are considering two classical 5 nonparametric regressions with Gaussian errors. Then, according to [15,16], we get the following Gaussian sequence model for empirical wavelet coefficients,

equation                              (4)

where all considered ξjk and ηjk are mutually independent standard normal. Justification of the made assumption and model (4) for the studied fMRI application can be found in Efromovich and Valdez- Jasso [17] and Valdez-Jasso [18].

Furthermore, there are total p1 right hemisphere voxels and p2 left hemisphere voxels, with typical numbers for p1 and p2 around one hundred. Note that p1 and p2 are not necessarily equal, and then the total number of considered cross-correlations is p1 × p2. Another important remark for our practical example is that we do not use an assumed spatial correlation of signals because it is of interest to: (i) understand how and when voxels in motor-cortex are activated; (ii) test the developed methodology for an arbitrary large- p-small-n setting; (iii) use a traditionally assumed spatial correlation to test the obtained results.

Assume that the underlying functions of interest Y(t) and X(t) are integrated to zero on [0,1] (this is the case in our fMRI example), that is equation. The first problem is to estimate, based on observations equation , l=1,…,n and equation , l=1,…,n, or correspondingly observations (4), the cross-covariance between the underlying functions Y(t) and X(t) which is defined as

equation                              (5)

as well as the corresponding cross-correlation between underlying functions Y(t) and X(t) which is defined as

equation                              (6)

The related problem is to estimate a cross-covariance matrix between p1 signals Y(t) and p2 signals X(t) based on p1 filtered time series equation and p2 filtered time series equation. In this setting, let let Y={Y1,Y2,…,Yp1} be a p1 dimensional vector where each Yr is a time series signal with n observations and X = {X1,X1,…,Xp2} be a p2 dimensional vector where each Xr is a time series signal based on n observations. Our goal is to estimate a corresponding p1 × p2 cross-covariance matrix. Complexity of the problem is defined by large p1p2 and small n setting, as well as the presence of both sparse and dense rows (columns) in the estimated matrices.

Now let us describe the proposed methodology of estimation. We begin with estimation of cross-covariance (5). First, we rewrite (5), using the Parseval's identity and corresponding wavelet expansions (3), as


Then, according to (4), a natural estimator of σYX is


where empirical wavelet coefficients equation and equation are defined in (2).

To estimate cross-correlation (6) between signals Y(t) and X(t), we use a plug-in estimate


where equation and equation are unbiased estimates of σYY and σXX , respectively, and they are defined as


where equation and equation are mad estimators [8] of parameters τ and ν based on empirical wavelet coefficients on the finest scale.

The following theorem presents recommended confidence intervals. Definition of statistics equation and equation , used in the theorem, can be found in the Appendix. Let us also note that when we consider p1×p2 matrices, then we may use notation equation, σij, etc. for corresponding ith and jth signals Y and X.

Theorem 1. (a) (1‒α) confidence interval for a cross-covariance σYX is


where zα/2 is the (1‒α/2) quantile of the standard normal distribution.

(b) Simultaneous (1‒α) confidence interval for cross-covariances σij, i ∈{1,2,…,p1}, j ∈{1,2,…,p2} is


where γα is defined as solution of the equation


(c) (1‒α) confidence interval for a cross-correlation ρYX is


(d) Simultaneous (1‒α) confidence interval for cross-correlation ρij, i ∈ {1,2,…,p1},

j ∈ {1,2,…,p2} is


where γα is defined in (b).

Next theoretical result is about rate-minimax estimation of sparse cross-covariances matrices. Following [13,14], let us introduce a class of p1×p2 cross-covariance matrices


Theorem 2. Thresholding estimator


is rate minimax over the class uq(s0(p1, p2)), that is, for some finite constant C>0


The rate of the upper bound cannot be improved.

Note that in Theorem 2 the traditional matrix l1-norm is used as the loss function.

fMRI Application

Researchers at the Southwestern Medical School explored the possibility of using alterations in resting brain activity for finding potential markers for brain plasticity; details of the study can be found Tung et al. [3] and here we just briey describe them. Using 3T MR scanner, blood-oxygenation-level-dependent (BOLD) signal fMRI data were obtained for healthy right-handed adults. The motor cortex model studies were based on a five minute pre-training session, followed by a 23 minutes training session, and finished with a 5 minute post-training run. During the training session, a subject was shown a white cross-hair and was instructed to press a button three times with the thumb when a change in color was detected. The color change occurred randomly every 27-32 seconds, and during the motor task period there were a total of 40 times when color change occurred. The raw data from the experiment consists of 300 time points for each of pre and post training sessions, and four sessions of 340 time points during the training part; the data was measured every second. All data was transformed into Talairach coordinates using AFNI program and standardized mask region of interest (ROI) applied to all subjects to select the motor cortex voxels' time series. This preprocessed data was used in our analysis. The schematic experimental setup is shown in Figure 2.


Figure 2: Experimental Setup. The experiment starts with a pre-training fMRI scans taken every 1 sec., followed by a button clicking training session, and finished with a post-training rest stage.

Clearly, the averaging over all voxels in ROI smooths the time series signal and avoids the problem of simultaneous estimation of elements of a large cross-correlation matrix. On the other hand, in each ROI slice there are 80-130 voxels in the left and right motor cortex areas and there are 11 slices where ROI is clearly visible on fMRI scan. The aim is to understand how this information can used to shed a new light on the neural plasticity.

Following results described in the Methods section, we obtain wavelet decomposition for all voxels in the left and right hemisphere ROI. Each voxel in pre and post training sessions consist of 300 time points, and because a wavelet decomposition can be obtained for a signal of length n=2J, only 256 time points are used to calculate wavelet coefficients. Then a signal consisting of 256 time points is decomposed into 6 wavelet scales, one father scale and 5 mother scales. Furthermore, because we are interested only in a hemodynamic response, in fMRI data processing a BOLD signal is commonly bandpass-filtered to 0.01- 0.1 Hz, see [7,19,20]. The studied case of size n=256 implies considering scales 3, 4, and 5 corresponding to frequencies of the hemodynamic response. In other words, for our applied example a sum like equation which we are using throughout the paper, is defined as (compare with (3))

equation                              (7)

In what follows we also use the convention that Y (t) is a signal in a right hemisphere and X(t) is in the left one. Adaptive covariance thresholding estimator equation introduced in Theorem 2, is applied to fMRI data for each ijth voxel pair (i-right voxel, j-left voxel) to identify which pairs of covariances are significantly positive. In the thresholding rule we use p1 as the number of voxels in the right motor cortex, p2 as the number of voxels in the left motor cortex, and in all presented figures a single fMRI slice is considered. Theorem 1 allows us to construct confidence intervals for elements of a cross-correlation matrix. In what follows, statistical methodology, developed in this paper, is illustrated via a study of fMRI conducted on a single volunteer.


The fMRI experiment, described in the previous section, consists of 6 parts. The first one is a pre-training session, when a right-handed adult is relaxed. This session allows us to understand how voxels in left and right motor cortices are connected during no activity time. Next 4 sessions are training sessions when a participant is asked to press a button, by right hand thumb, three times when a color signal is activated. Note that the left motor cortex is solely responsible for the movement of the right thumb, and therefore it is of interest how right motor cortex may react on the movement. Namely, the activation sessions allow us to understand what changes occur in pathways between voxels in left and right motor cortices. In particular, we would like to understand how many new (if any) pathways are created and what changes in cross-correlations are happening. Finally, during the post-training session, no activity occurs and we would like to understand how voxels in the left and right motor cortices “relax" after the training sessions.

Figure 3 shows us estimated cross-correlation matrices for a single slice of the left and right motor cortexes during the fMRI experiment. Each matrix corresponds to a particular session and the description can be found in the caption. For the reader's convenience, the left two diagrams show cross-correlation matrices for pre-training and post-training sessions. As we see, the activation by clicking a button causes increase in the number of activated pathways between the left an right hemispheres. Furthermore, the level of cross-correlations is dramatically increased. The third diagram exhibits the cross-correlation matrix for the first training session. We see a significant increase in cross-correlations between some pairs of voxels with respect to both pre-training and post-training sessions. On the other hand, the total number of activated pathways is about the same with the post-training session. The outcome changes rather dramatically during the second activation session (see the fourth from the left diagram) when the level of cross-correlations and the number of activated pathways dominates all other sessions. The level of cross-correlations and the number of activated pathways between left and right hemispheres remain relatively high during last two activation sessions, but the activity clearly decreases in comparison with the second activation session.


Figure 3: Estimates of cross-correlation matrices. In each diagram the y-axis corresponds to voxels from the right motor cortex and the x-axis to voxels from the left motor cortex. A larger cross- correlation yields a whiter (brighter) color. The left two diagrams show matrices for pre-training and post-training. The third to the sixth diagrams exhibit cross-correlation matrices for the four training sessions.

Figure 4 allows us to visualize changes in the number of activated pathways from pretraining to post-training sessions. As we see, there is a significant increase in the number of pathways corresponding to larger cross-covariance values while the number of pathways between pairs with small cross-correlation values remains about the same.


Figure 4: Two frequency histograms of cross-correlations for one voxel from left hemisphere and all voxels from right hemisphere. The dashed and solid lines correspond to pre-training and post-training sessions.

Figure 5 presents the geometry of locations of activated pairs of voxels during pre- and post-training sessions. Here voxels are arrange according to their geometric location in motor-cortices, and only pairs with cross-correlation larger than 0.6 are shown by white (brighter) color. As we see, our estimation procedure nicely defines areas of activated voxels, and furthermore we see a significant increase of activated pathways as well as the increased level of their activation during the post-training session.


Figure 5: Cross-covariance matrices for pre-training (the left diagram) and post-training sessions.Only cells with cross-covariance larger than 0.6 are shown. In the diagrams voxels are arrange according to their geometric location in corresponding motor cortexes, axial view.

Figure 6 allows us to zoom diagrams in Figure 5 and look at the geometry of activated pairs more closely. It also contains similar "zooms" for the four activation sessions. What we see is that, during the post-training session, a dramatically larger number of voxel pairs is activated. Interestingly, the first activation session does not impress us in this respect, but the second activation session does. Furthermore, the activation slows down during two last activation sessions but still the relationship between left and right hemispheres is very strong.


Figure 6: Zoomed parts of shown in Figure 4 correlation matrices containing highly activated (cross-covariance larger 0.6) pairs of voxels. The parts are shown in the two left diagrams. Similar zooms for the four activation sessions are shown in the four right diagrams.

Finally, Figure 7 shows us adaptive simultaneous confidence intervals for cross-correlations during pre- and post-training sessions. First of all, we clearly see the increase in pairs of activated voxels with larger cross-correlations. Furthermore, we can see how the width of a simultaneous confidence interval adapts to specific characteristics of a particular voxel. The later may be more clearly seen in the right diagram.


Figure 7: Simultaneous confidence intervals for cross-correlations. Cross-correlations are shown for one right hemisphere voxel and all left hemisphere voxels from the region of interest. Left and right diagrams correspond to pre-training and post-training sessions, respectively.

We may conclude that the proposed methodology of estimation and inference for crosscorrelation matrices performs well and does shed light on the dynamic of pathways between left and right hemispheres. A simple "clicking-the button" exercise does change the brain network. Furthermore, in addition to the conclusion of Tung et al. [3], that training of the brain increases the correlation between averaged activities in the right and left hemispheres, we can also see specific areas of activation and the level of activation on a voxel-byvoxel level. The latter may be of a special interest for finding a cure for brain diseases because the proposed methodology points upon specific areas of the brain that can be trained to substitute for disease-damaged areas of the brain.


FMRI study of neural paths connecting left and right hemispheres involves estimation of large cross-correlation matrices based on a relatively small data. Furthermore, these matrices may contain both sparse and dense rows and are contaminated by large noise. To estimate such matrices, it is proposed to treat observations, available for a single voxel, as a nonparametric regression, then apply a wavelet transform and work with empirical wavelet coefficients to estimate elements of the cross-correlation matrix, and finally use the methodology of adaptive thresholding and adaptive simultaneous confidence intervals. The proposed methodology is supported by the asymptotic theory which, in particular, shows rate-minimaxity of the adaptive thresholding procedure.

Analysis of fMRI data indicates that the proposed estimators allow us to understand how neural paths change during learning process. Namely, we can visualize both activation of new paths and change in activity of old paths, described by increased cross-correlations. The proposed minimax threshold estimator allows us to eliminate insignificant elements in the cross-covariance function, and therefore reduce dimensionality of the cross correlation matrix of interest. Then the number of comparisons in a cross-correlation matrix can be adjusted accordingly, making the width of simultaneous confidence intervals shorter. The proposed methodology is rather universal and can be used for estimation of large p1×p2 cross-correlation matrices whenever a wavelet approximation of underlying signals is feasible.


The research was supported by NSF Grant DMS-0906790 and NSA Grant H982301310212.


Select your language of interest to view the total content in your interested language
Post your comment

Share This Article

Relevant Topics

Recommended Conferences

Article Usage

  • Total views: 12194
  • [From(publication date):
    April-2014 - May 23, 2018]
  • Breakdown by view type
  • HTML page views : 8376
  • PDF downloads : 3818

Post your comment

captcha   Reload  Can't read the image? click here to refresh

Peer Reviewed Journals
Make the best use of Scientific Research and information from our 700 + peer reviewed, Open Access Journals
International Conferences 2018-19
Meet Inspiring Speakers and Experts at our 3000+ Global Annual Meetings

Contact Us

Agri & Aquaculture Journals

Dr. Krish

[email protected]

1-702-714-7001Extn: 9040

Biochemistry Journals

Datta A

[email protected]

1-702-714-7001Extn: 9037

Business & Management Journals


[email protected]

1-702-714-7001Extn: 9042

Chemistry Journals

Gabriel Shaw

[email protected]

1-702-714-7001Extn: 9040

Clinical Journals

Datta A

[email protected]

1-702-714-7001Extn: 9037

Engineering Journals

James Franklin

[email protected]

1-702-714-7001Extn: 9042

Food & Nutrition Journals

Katie Wilson

[email protected]

1-702-714-7001Extn: 9042

General Science

Andrea Jason

[email protected]

1-702-714-7001Extn: 9043

Genetics & Molecular Biology Journals

Anna Melissa

[email protected]

1-702-714-7001Extn: 9006

Immunology & Microbiology Journals

David Gorantl

[email protected]

1-702-714-7001Extn: 9014

Materials Science Journals

Rachle Green

[email protected]

1-702-714-7001Extn: 9039

Nursing & Health Care Journals

Stephanie Skinner

[email protected]

1-702-714-7001Extn: 9039

Medical Journals

Nimmi Anna

[email protected]

1-702-714-7001Extn: 9038

Neuroscience & Psychology Journals

Nathan T

[email protected]

1-702-714-7001Extn: 9041

Pharmaceutical Sciences Journals

Ann Jose

[email protected]

1-702-714-7001Extn: 9007

Social & Political Science Journals

Steve Harry

[email protected]

1-702-714-7001Extn: 9042

© 2008- 2018 OMICS International - Open Access Publisher. Best viewed in Mozilla Firefox | Google Chrome | Above IE 7.0 version
Leave Your Message 24x7