Statistics
Statistical functions
| Option | Description |
|---|---|
| title | stats.py |
| authors | Florence Brun, Guillaume Dumas |
| date | 2020-03-18 |
con_matrix(epochs, freqs_mean, draw=False)
Compute a priori channel connectivity across space and frequencies.
This function creates connectivity matrices that define which channels and frequencies should be considered neighbors for cluster-based statistics.
Parameters
epochs : mne.Epochs
Epochs object containing channel information.
freqs_mean : List[float]
Frequencies in the frequency band of interest used for power or coherence spectral density calculation.
draw : bool, optional
Whether to plot the connectivity matrices (default=False).
Returns
con_matrixTuple : namedtuple
A named tuple containing:
- ch_con: Connectivity matrix between channels in space, scipy.sparse.csr_matrix of shape (n_channels, n_channels)
- ch_con_freq: Connectivity matrix between channels across space and frequencies, scipy.sparse.csr_matrix of shape (n_channels * len(freqs_mean), n_channels * len(freqs_mean))
Notes
The channel connectivity matrix (ch_con) is based on the spatial adjacency of EEG electrodes - channels that are physically adjacent are considered connected.
The frequency-space connectivity matrix (ch_con_freq) extends this spatial adjacency to include frequency adjacency - neighboring frequencies for the same channel are also considered connected.
These connectivity matrices are used as inputs to cluster-based statistical functions to define the neighborhood structure for clustering.
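The relationship between ch_con and ch_con_freq described in these notes can be sketched with NumPy alone. The three-electrode layout, the variable freq_con, and the frequency-major ordering of the joint index (index = frequency * n_channels + channel) are illustrative assumptions; con_matrix may order or build its matrix differently.

```python
import numpy as np

# Toy spatial adjacency for 3 electrodes in a row: 0-1 and 1-2 are neighbors.
ch_con = np.array([
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
])
n_channels = ch_con.shape[0]
n_freqs = 2  # two frequency bins in the band of interest (assumed)

# Adjacency between neighboring frequency bins (f and f + 1).
freq_con = np.eye(n_freqs, k=1, dtype=int) + np.eye(n_freqs, k=-1, dtype=int)

# Joint adjacency: spatial neighbors at the same frequency,
# plus the same channel at neighboring frequencies.
same_freq = np.kron(np.eye(n_freqs, dtype=int), ch_con)
same_channel = np.kron(freq_con, np.eye(n_channels, dtype=int))
ch_con_freq = ((same_freq + same_channel) > 0).astype(int)

print(ch_con_freq.shape)  # (6, 6)
```

In this sketch, entry [0, 1] links channels 0 and 1 at the first frequency, and entry [0, 3] links channel 0 at the first frequency to channel 0 at the second.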
Examples
Create connectivity matrices for alpha band frequencies
    import numpy as np

    alpha_freqs = np.arange(8, 13)
    conn = con_matrix(epochs, alpha_freqs, draw=True)
    ch_con = conn.ch_con  # Channel spatial connectivity
    ch_con_freq = conn.ch_con_freq  # Channel-frequency connectivity
metaconn_matrix(electrodes, ch_con, freqs_mean)
Compute a priori connectivity between pairs of sensors within one brain.
This function creates connectivity matrices for pairs of channels within a single brain, taking into account spatial adjacency for cluster-based statistics.
Parameters
electrodes : List[Tuple[int, int]]
List of electrode pairs for which connectivity indices have been computed. Each tuple contains the indices of two channels from the same participant.
ch_con : scipy.sparse.csr_matrix
Connectivity matrix between channels in space, typically from con_matrix().
freqs_mean : List[float]
Frequencies in the frequency band of interest.
Returns
metaconn_matrixTuple : namedtuple
A named tuple containing:
- metaconn: Connectivity matrix between channel pairs, array of shape (len(electrodes), len(electrodes))
- metaconn_freq: Connectivity matrix between channel pairs across space and frequencies, array of shape (len(electrodes) * len(freqs_mean), len(electrodes) * len(freqs_mean))
Notes
This function determines whether two channel pairs are connected based on the spatial adjacency of their constituent channels. It considers various combinations of adjacency between the channels.
The resulting connectivity matrices define the neighborhood structure for cluster-based statistics on connectivity data within a single brain.
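As a rough illustration of how adjacency between channel pairs can be derived from adjacency between channels, here is a minimal sketch. The helper pair_adjacency and its rule (two pairs are connected when their channels match up as neighbors or identical channels, in either order) are assumptions made for illustration; the exact combinations tested by metaconn_matrix are not listed in this document.

```python
import numpy as np

def pair_adjacency(electrodes, ch_con):
    """Return an (n_pairs, n_pairs) 0/1 matrix of adjacency between channel pairs."""
    # Treat a channel as "near" itself so pairs sharing a channel can connect.
    near = (np.asarray(ch_con) > 0) | np.eye(len(ch_con), dtype=bool)
    n_pairs = len(electrodes)
    metaconn = np.zeros((n_pairs, n_pairs), dtype=int)
    for i, (a, b) in enumerate(electrodes):
        for j, (c, d) in enumerate(electrodes):
            straight = near[a, c] and near[b, d]  # a~c and b~d
            crossed = near[a, d] and near[b, c]   # a~d and b~c
            metaconn[i, j] = int(straight or crossed)
    return metaconn

# Four electrodes in a row: 0-1, 1-2 and 2-3 are spatial neighbors.
ch_con = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
electrodes = [(0, 1), (1, 2), (0, 3)]
metaconn = pair_adjacency(electrodes, ch_con)
print(metaconn)
```

Here the pairs (0, 1) and (1, 2) are connected because 0 neighbors 1 and 1 neighbors 2, while (0, 1) and (0, 3) are not, because channels 1 and 3 are neither identical nor adjacent.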
Examples
Create metaconnectivity matrices for intrabrain connectivity
    electrode_pairs = indices_connectivity_intrabrain(epochs)
    metaconn = metaconn_matrix(
        electrode_pairs, ch_con.ch_con, freqs_mean=[10]
    )
metaconn_matrix_2brains(electrodes, ch_con, freqs_mean, plot=False)
Compute a priori connectivity matrices for hyperscanning analyses.
This function creates connectivity matrices for pairs of channels across two brains (participants), taking into account spatial adjacency within each brain but assuming no direct connectivity between brains.
Parameters
electrodes : List[Tuple[int, int]]
List of electrode pairs for which connectivity indices have been computed. Each tuple contains the index of a channel from participant 1 and the index of a channel from participant 2.
ch_con : scipy.sparse.csr_matrix
Connectivity matrix between channels in space, typically from con_matrix().
freqs_mean : List[float]
Frequencies in the frequency band of interest.
plot : bool, optional
Whether to plot the connectivity matrices (default=False).
Returns
metaconn_matrix_2brainsTuple : namedtuple
A named tuple containing:
- metaconn: Connectivity matrix between channel pairs, array of shape (len(electrodes), len(electrodes))
- metaconn_freq: Connectivity matrix between channel pairs across space and frequencies, array of shape (len(electrodes) * len(freqs_mean), len(electrodes) * len(freqs_mean))
Notes
This function assumes there is no a priori connectivity between channels from different participants. It considers two channel pairs connected if:
1. The respective channels within each participant are connected, or
2. Some channels are identical across the pairs
The resulting connectivity matrices define the neighborhood structure for cluster-based statistics on hyperscanning data.
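One way to read the two conditions in these notes is sketched below. It assumes that both participants share the same montage, that each tuple holds 0-based channel indices (participant 1, participant 2) into the same ch_con, and that "identical" channels count as connected; the helper pair_adjacency_2brains is hypothetical and the indexing used by metaconn_matrix_2brains may differ.

```python
import numpy as np

def pair_adjacency_2brains(electrodes, ch_con):
    """Adjacency between inter-brain channel pairs, compared within each participant."""
    # A channel is "near" another if they are spatial neighbors or identical.
    near = (np.asarray(ch_con) > 0) | np.eye(len(ch_con), dtype=bool)
    n_pairs = len(electrodes)
    metaconn = np.zeros((n_pairs, n_pairs), dtype=int)
    for i, (p1_a, p2_a) in enumerate(electrodes):
        for j, (p1_b, p2_b) in enumerate(electrodes):
            # Participant 1 channels are only compared with participant 1 channels,
            # and likewise for participant 2: no adjacency across brains.
            metaconn[i, j] = int(near[p1_a, p1_b] and near[p2_a, p2_b])
    return metaconn

# Three electrodes in a row, same layout for both participants (assumed).
ch_con = np.array([
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
])
electrodes = [(0, 0), (1, 0), (2, 2)]
metaconn = pair_adjacency_2brains(electrodes, ch_con)
print(metaconn)
```

Unlike a within-brain rule, there is no "crossed" comparison here, since a channel from participant 1 is never treated as a neighbor of a channel from participant 2.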
Examples
Create metaconnectivity matrices for interbrain connectivity
    electrode_pairs = indices_connectivity_interbrain(epochs_hyper)
    metaconn = metaconn_matrix_2brains(
        electrode_pairs, ch_con.ch_con, freqs_mean=[10], plot=True
    )
statsCond(data, epochs, n_permutations, alpha)
Perform a statistical t-test on participant measures (e.g., PSD) for a condition.
This function tests whether the observed mean significantly deviates from 0 using a permutation-based t-test with False Discovery Rate (FDR) correction for multiple comparisons.
Parameters
data : np.ndarray
Array of participant measures with shape (n_samples, n_tests, n_freq), where n_tests typically represents channels and n_freq the frequencies. Values are averaged across frequencies before the statistics are computed.
epochs : mne.Epochs
Epochs object for a condition from a random participant, used only to access information such as channel positions.
n_permutations : int
Number of permutations for the statistical test. Should be at least 2*n_samples, e.g., 50000.
alpha : float
Significance threshold for the test, e.g., 0.05.
Returns
statsCondTuple : namedtuple
A named tuple containing:
- T_obs: T-statistic observed for all variables, array of shape (n_tests)
- p_values: p-values for all tests, array of shape (n_tests)
- H0: T-statistics obtained by permutations, array of shape (n_permutations)
- adj_p: tuple with boolean assessment of significance and FDR-corrected p-values
- T_obs_plot: statistical values for significant sensors, array of shape (n_tests)
Notes
This test calculates if the observed mean significantly deviates from 0; it doesn't compare two periods, but tests one period against the null hypothesis.
Randomized data are generated with random sign flips, and the test is two-tailed by default (the alternative hypothesis is that the data mean is different from 0).
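The sign-flip idea can be shown for a single variable with a short NumPy sketch. The function sign_flip_p_value, its defaults, and the simulated sample are illustrative assumptions; statsCond works on all tests at once and may compute its p-values differently.

```python
import numpy as np

def sign_flip_p_value(x, n_permutations=999, seed=0):
    """Two-tailed sign-flip permutation p-value for H0: mean(x) == 0."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size

    def t_stat(v):
        return v.mean() / (v.std(ddof=1) / np.sqrt(n))

    t_obs = t_stat(x)
    # Each permutation multiplies every observation by a random +1 or -1.
    signs = rng.choice([-1.0, 1.0], size=(n_permutations, n))
    t_perm = np.array([t_stat(s * x) for s in signs])
    # Two-tailed: compare absolute values; the +1 terms count the observed data.
    p_value = (np.sum(np.abs(t_perm) >= abs(t_obs)) + 1) / (n_permutations + 1)
    return t_obs, p_value

rng = np.random.default_rng(42)
sample = 1.0 + 0.1 * rng.standard_normal(20)  # simulated data shifted away from 0
t_obs, p_value = sign_flip_p_value(sample)
print(t_obs, p_value)
```

Under the null hypothesis the sign of each observation is arbitrary, so the flipped datasets show how large the t-statistic can get by chance alone.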
To reduce false positives due to multiple comparisons, False Discovery Rate (FDR) correction is applied to the p-values.
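For readers unfamiliar with FDR correction, the Benjamini-Hochberg procedure is one common variant; the sketch below implements it with NumPy. Whether statsCond uses this exact variant is an assumption, and fdr_bh is a hypothetical helper written for this illustration.

```python
import numpy as np

def fdr_bh(p_values, alpha=0.05):
    """Benjamini-Hochberg procedure: return (reject mask, adjusted p-values)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the k-th smallest p-value by m / k.
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity from the largest p-value downwards, then cap at 1.
    adjusted_sorted = np.minimum(np.minimum.accumulate(ranked[::-1])[::-1], 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted <= alpha, adjusted

reject, adjusted = fdr_bh([0.01, 0.04, 0.03, 0.20], alpha=0.05)
print(reject)    # only the first test survives correction
print(adjusted)
```

The returned pair mirrors the shape of adj_p described above: a boolean significance mask and the corrected p-values.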
The frequency dimension is reduced to one for the test (average in the frequency band-of-interest). To take frequencies into account, use cluster statistics (see statscondCluster function).
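The reduction of the frequency dimension can be pictured as follows, assuming a plain arithmetic mean over the last axis and randomly generated placeholder data:

```python
import numpy as np

n_samples, n_tests, n_freq = 10, 4, 5  # participants, channels, frequency bins
rng = np.random.default_rng(0)
data = rng.standard_normal((n_samples, n_tests, n_freq))

# Collapse the frequency axis so each participant has one value per channel.
data_mean = data.mean(axis=2)
print(data_mean.shape)  # (10, 4)
```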
statscluster(data, test, factor_levels, ch_con_freq, tail, n_permutations, alpha=0.05)
Perform cluster-based statistical tests with various test statistics.
This function provides a flexible interface to cluster-based permutation tests, supporting different statistical tests including t-tests, one-way ANOVA, and multiple-way ANOVA for complex experimental designs.
Parameters
data : list or np.ndarray
- For t-tests and one-way ANOVA: list of arrays containing values from the groups or conditions to compare
- For multiple-way ANOVA: NumPy array organized by factors
test : str
Type of statistical test to use:
- 'ind ttest': Independent samples t-test
- 'rel ttest': Related (paired) samples t-test
- 'f oneway': One-way ANOVA
- 'f multipleway': Multiple-way ANOVA
factor_levels : List[int] or None
For multiple-way ANOVA, list specifying the number of levels for each factor (e.g., [2, 3] for a 2×3 design with 2 levels of factor 1 and 3 levels of factor 2). Set to None for other tests.
ch_con_freq : scipy.sparse.csr_matrix
Connectivity or metaconnectivity matrix defining adjacency across space and frequencies.
tail : int
Direction of the test:
- 0: two-tailed test (must be used for f oneway)
- 1: one-tailed test (greater) (must be used for f multipleway)
- -1: one-tailed test (less)
n_permutations : int
Number of permutations for the statistical test, e.g., 50000.
alpha : float, optional
Significance threshold for clusters (default=0.05).
Returns
statscondClusterTuple : namedtuple
A named tuple containing:
- Stat_obs: Observed statistic (T or F) for all variables
- clusters: Boolean array indicating locations in significant clusters
- cluster_p_values: p-value for each identified cluster
- H0: Max cluster-level statistics under permutation
- Stat_obs_plot: Statistical values for significant sensors
Notes
For multiple-way ANOVA with connectivity values, the last dimensions may need to be flattened from shape (n_sensors, n_sensors) to a vector using np.reshape.
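A minimal sketch of that flattening step is shown below. The array layout (n_conditions, n_subjects, n_sensors, n_sensors) and the random placeholder values are assumptions chosen to match the (4, n_subjects, n_features) shape used in the examples.

```python
import numpy as np

n_conditions, n_subjects, n_sensors = 4, 12, 6
rng = np.random.default_rng(0)
conn_values = rng.random((n_conditions, n_subjects, n_sensors, n_sensors))

# Merge the two sensor axes into a single feature axis.
flat = np.reshape(conn_values, (n_conditions, n_subjects, n_sensors * n_sensors))
print(flat.shape)  # (4, 12, 36)

# Feature k maps back to the sensor pair (k // n_sensors, k % n_sensors).
k = 8
i, j = divmod(k, n_sensors)
print(flat[0, 0, k] == conn_values[0, 0, i, j])  # True
```

Keeping track of this index mapping makes it possible to reshape the resulting statistics back to (n_sensors, n_sensors) for plotting.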
The function applies different thresholding approaches based on the test type:
- For t-tests: Uses the alpha value directly
- For one-way ANOVA: Uses the alpha value directly
- For multiple-way ANOVA: Calculates appropriate F-thresholds based on factor levels
With t_power=1, each location is weighted by its statistical score within a cluster.
Examples
Independent t-test between two groups
    ind_ttest_results = statscluster(
        [group1_data, group2_data],
        test='ind ttest',
        factor_levels=None,
        ch_con_freq=connectivity.ch_con_freq,
        tail=0,  # two-tailed test
        n_permutations=10000,
        alpha=0.05
    )
2×2 repeated measures ANOVA (within-subjects design)
Factor 1: Condition (2 levels), Factor 2: Time (2 levels)
    anova_results = statscluster(
        data_array_2x2,  # Shape: (4, n_subjects, n_features)
        test='f multipleway',
        factor_levels=[2, 2],
        ch_con_freq=connectivity.ch_con_freq,
        tail=1,  # F-tests are one-tailed
        n_permutations=10000,
        alpha=0.05
    )
statscondCluster(data, freqs_mean, ch_con_freq, tail, n_permutations, alpha=0.05)
Perform cluster-level statistical permutation test on neurophysiological data.
This function applies a cluster-based permutation test to identify significant differences between conditions, correcting for multiple comparisons by taking into account connectivity across space and frequencies.
Parameters
data : list
List of arrays containing values from the conditions or groups to compare. Each array has shape (n_observations, n_features).
freqs_mean : list
Frequencies in the frequency band of interest.
ch_con_freq : scipy.sparse.csr_matrix
Connectivity or metaconnectivity matrix defining adjacency across space and frequencies, typically from con_matrix() or metaconn_matrix().
tail : int
Direction of the test:
- 0: two-tailed test
- 1: one-tailed test (greater)
- -1: one-tailed test (less)
n_permutations : int
Number of permutations for the statistical test, e.g., 50000.
alpha : float, optional
Significance threshold for clusters (default=0.05).
Returns
statscondClusterTuple : namedtuple
A named tuple containing:
- F_obs: Observed F-statistic for all variables, array of shape (n_features)
- clusters: Boolean array indicating locations in significant clusters
- cluster_p_values: p-value for each identified cluster
- H0: Max cluster-level statistics under permutation, array of shape (n_permutations)
- F_obs_plot: F-values for significant sensors, array of shape (n_features)
Notes
This function uses MNE's permutation_cluster_test to perform the analysis.
With t_power=1 (default), each location is weighted by its statistical score within a cluster, which gives more weight to stronger effects.
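The effect of t_power on cluster-level statistics can be illustrated with a toy example in plain Python. It assumes a one-dimensional chain in which each location is adjacent to the next and only considers the positive tail; the helper cluster_masses is hypothetical, whereas the functions documented here define adjacency through ch_con_freq.

```python
def cluster_masses(stats, threshold, t_power=1):
    """Sum |stat| ** t_power over runs of adjacent supra-threshold locations."""
    masses = []
    current = None  # running mass of the cluster being built, if any
    for value in stats:
        if value > threshold:
            weight = abs(value) ** t_power
            current = weight if current is None else current + weight
        elif current is not None:
            # A sub-threshold location closes the current cluster.
            masses.append(current)
            current = None
    if current is not None:
        masses.append(current)
    return masses

stats = [0.5, 3.0, 4.0, 0.2, 5.0]
print(cluster_masses(stats, threshold=2.0, t_power=1))  # [7.0, 5.0]
print(cluster_masses(stats, threshold=2.0, t_power=0))  # [2.0, 1.0]
```

With t_power=1 the two-location cluster scores 7.0, the sum of its statistics; with t_power=0 every location counts as 1, so the score is simply the cluster size.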
The function automatically calculates appropriate thresholds for cluster formation based on the F-distribution for two-tailed tests.
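As a rough idea of what an F-distribution threshold looks like, the sketch below derives one with SciPy for a between-groups comparison. The degrees of freedom, the group size, and the use of 1 - alpha as the quantile are assumptions for illustration; the exact formula used by statscondCluster is not given in this document.

```python
from scipy import stats

alpha = 0.05
n_conditions = 2
n_observations = 20  # observations per condition (hypothetical)

# Degrees of freedom for a one-way, between-groups F-test.
dfn = n_conditions - 1
dfd = n_conditions * (n_observations - 1)

# Critical F-value: statistics above it are candidates for cluster formation.
f_threshold = stats.f.ppf(1 - alpha, dfn, dfd)
print(f_threshold)
```

A lower alpha raises the threshold, which tends to produce fewer and smaller candidate clusters.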
Examples
Compare alpha power between two conditions
    import numpy as np

    cluster_stats = statscondCluster(
        [condition1_data, condition2_data],
        freqs_mean=np.arange(8, 13),
        ch_con_freq=connectivity.ch_con_freq,
        tail=0,  # two-tailed test
        n_permutations=10000,
        alpha=0.05
    )
Get significant clusters
    significant_clusters = [
        i for i, p in enumerate(cluster_stats.cluster_p_values) if p <= 0.05
    ]
    print(f"Found {len(significant_clusters)} significant clusters")