Statistics

Statistical functions

Option	Description
title	stats.py
authors	Florence Brun, Guillaume Dumas
date	2020-03-18

`con_matrix(epochs, freqs_mean, draw=False)`

Compute a priori channel connectivity across space and frequencies.

This function creates connectivity matrices that define which channels and frequencies should be considered neighbors for cluster-based statistics.

Parameters

epochs : mne.Epochs

Epochs object containing channel information

freqs_mean : List[float]

List of frequencies in the frequency-band-of-interest used for power or coherence spectral density calculation

draw : bool, optional

Whether to plot the connectivity matrices (default=False)

Returns

con_matrixTuple : namedtuple

A named tuple containing:

- ch_con: Connectivity matrix between channels in space, scipy.sparse.csr_matrix of shape (n_channels, n_channels)
- ch_con_freq: Connectivity matrix between channels across space and frequencies, scipy.sparse.csr_matrix of shape (n_channels * len(freqs_mean), n_channels * len(freqs_mean))

Notes

The channel connectivity matrix (ch_con) is based on the spatial adjacency of EEG electrodes - channels that are physically adjacent are considered connected.

The frequency-space connectivity matrix (ch_con_freq) extends this spatial adjacency to include frequency adjacency - neighboring frequencies for the same channel are also considered connected.

These connectivity matrices are used as inputs to cluster-based statistical functions to define the neighborhood structure for clustering.
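The relationship between the two matrices described above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: the helper name `extend_to_frequencies` is hypothetical, the matrices are dense rather than sparse, and rows are assumed to be ordered frequency by frequency (index = frequency index * n_channels + channel index).

```python
import numpy as np


def extend_to_frequencies(ch_con, n_freqs):
    """Extend a spatial adjacency matrix across n_freqs frequency bins.

    Hypothetical helper; assumes index = freq_idx * n_channels + ch_idx.
    """
    ch_con = np.asarray(ch_con).astype(int)
    n_channels = ch_con.shape[0]

    # Frequency bins i and j are neighbors when |i - j| == 1.
    idx = np.arange(n_freqs)
    freq_adj = (np.abs(idx[:, None] - idx[None, :]) == 1).astype(int)

    # Same frequency, spatially adjacent channels.
    spatial = np.kron(np.eye(n_freqs, dtype=int), ch_con)
    # Same channel, neighboring frequencies.
    spectral = np.kron(freq_adj, np.eye(n_channels, dtype=int))
    return (spatial + spectral) > 0


# Three channels in a chain (0-1-2), two frequency bins.
ch_con = np.array([[1, 1, 0],
                   [1, 1, 1],
                   [0, 1, 1]])
ch_con_freq = extend_to_frequencies(ch_con, n_freqs=2)
print(ch_con_freq.shape)
```

A dense boolean array like this one could be wrapped with scipy.sparse.csr_matrix to match the sparse type listed under Returns.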

Examples

Create connectivity matrices for alpha band frequencies

    import numpy as np

    alpha_freqs = np.arange(8, 13)
    conn = con_matrix(epochs, alpha_freqs, draw=True)
    ch_con = conn.ch_con  # Channel spatial connectivity
    ch_con_freq = conn.ch_con_freq  # Channel-frequency connectivity

`metaconn_matrix(electrodes, ch_con, freqs_mean)`

Compute a priori connectivity between pairs of sensors within one brain.

This function creates connectivity matrices for pairs of channels within a single brain, taking into account spatial adjacency for cluster-based statistics.

Parameters

electrodes : List[Tuple[int, int]]

List of electrode pairs for which connectivity indices have been computed. Each tuple contains the indices of two channels from the same participant.

ch_con : scipy.sparse.csr_matrix

Connectivity matrix between channels in space, typically from con_matrix()

freqs_mean : List[float]

List of frequencies in the frequency-band-of-interest

Returns

metaconn_matrixTuple : namedtuple

A named tuple containing:

- metaconn: Connectivity matrix between channel pairs, array of shape (len(electrodes), len(electrodes))
- metaconn_freq: Connectivity matrix between channel pairs across space and frequencies, array of shape (len(electrodes) * len(freqs_mean), len(electrodes) * len(freqs_mean))

Notes

This function determines whether two channel pairs are connected based on the spatial adjacency of their constituent channels. It considers various combinations of adjacency between the channels.

The resulting connectivity matrices define the neighborhood structure for cluster-based statistics on connectivity data within a single brain.
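The notes above do not spell out which combinations of adjacency are checked. As one plausible reading, sketched below, two unordered channel pairs are treated as neighbors when their channels can be matched one-to-one so that each matched couple is either the same channel or spatially adjacent. The helper names and the toy adjacency matrix are hypothetical, and the library may apply a different rule.

```python
import numpy as np


def _close(ch_con, a, b):
    """True if channels a and b are identical or spatially adjacent."""
    return a == b or bool(ch_con[a, b])


def pairs_adjacent(ch_con, pair1, pair2):
    """Assumed rule: try both ways of matching the channels of the two pairs."""
    (a1, b1), (a2, b2) = pair1, pair2
    straight = _close(ch_con, a1, a2) and _close(ch_con, b1, b2)
    crossed = _close(ch_con, a1, b2) and _close(ch_con, b1, a2)
    return straight or crossed


# Toy montage: four channels in a chain, 0-1-2-3.
ch_con = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]])
electrodes = [(0, 1), (1, 2), (0, 3)]

metaconn = np.array([[pairs_adjacent(ch_con, p, q) for q in electrodes]
                     for p in electrodes])
print(metaconn.astype(int))
```

In this toy case the pairs (0, 1) and (1, 2) come out as neighbors, while (0, 1) and (0, 3) do not, because channels 1 and 3 are neither identical nor adjacent.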

Examples

Create metaconnectivity matrices for intrabrain connectivity

    electrode_pairs = indices_connectivity_intrabrain(epochs)
    # ch_con here is the named tuple returned by con_matrix()
    metaconn = metaconn_matrix(
        electrode_pairs, ch_con.ch_con, freqs_mean=[10]
    )

`metaconn_matrix_2brains(electrodes, ch_con, freqs_mean, plot=False)`

Compute a priori connectivity matrices for hyperscanning analyses.

This function creates connectivity matrices for pairs of channels across two brains (participants), taking into account spatial adjacency within each brain but assuming no direct connectivity between brains.

Parameters

electrodes : List[Tuple[int, int]]

List of electrode pairs for which connectivity indices have been computed. Each tuple contains the indices of channels from participant 1 and participant 2.

ch_con : scipy.sparse.csr_matrix

Connectivity matrix between channels in space, typically from con_matrix()

freqs_mean : List[float]

List of frequencies in the frequency-band-of-interest

plot : bool, optional

Whether to plot the connectivity matrices (default=False)

Returns

metaconn_matrix_2brainsTuple : namedtuple

A named tuple containing:

- metaconn: Connectivity matrix between channel pairs, array of shape (len(electrodes), len(electrodes))
- metaconn_freq: Connectivity matrix between channel pairs across space and frequencies, array of shape (len(electrodes) * len(freqs_mean), len(electrodes) * len(freqs_mean))

Notes

This function assumes there is no a priori connectivity between channels from different participants. It considers two channel pairs connected if:

1. The respective channels within each participant are connected, or
2. Some channels are identical across the pairs

The resulting connectivity matrices define the neighborhood structure for cluster-based statistics on hyperscanning data.
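The rule stated in these notes can be sketched in vectorized form. This is an interpretation, not the library's code: `metaconn_2brains_sketch` is a hypothetical helper, each tuple is read as (channel of participant 1, channel of participant 2), both participants are assumed to share one montage and therefore one ch_con, and channels are only ever compared within the same participant.

```python
import numpy as np


def metaconn_2brains_sketch(ch_con, electrodes):
    """Pair-by-pair adjacency for interbrain channel pairs (assumed rule).

    Two pairs are neighbors when, for each participant separately,
    their channels are identical or spatially adjacent.
    """
    ch_con = np.asarray(ch_con).astype(bool)
    # Treat every channel as adjacent to itself (covers identical channels).
    close = ch_con | np.eye(ch_con.shape[0], dtype=bool)
    p1 = np.array([e[0] for e in electrodes])  # participant 1 channels
    p2 = np.array([e[1] for e in electrodes])  # participant 2 channels
    return close[np.ix_(p1, p1)] & close[np.ix_(p2, p2)]


# Toy montage: three channels in a chain, 0-1-2.
ch_con = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]])
electrodes = [(0, 0), (1, 0), (2, 2), (0, 1)]
metaconn = metaconn_2brains_sketch(ch_con, electrodes)
print(metaconn.astype(int))
```

Here (0, 0) and (1, 0) are neighbors because the participant 1 channels are adjacent and the participant 2 channel is shared, while (0, 0) and (2, 2) are not.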

Examples

Create metaconnectivity matrices for interbrain connectivity

    electrode_pairs = indices_connectivity_interbrain(epochs_hyper)
    # ch_con here is the named tuple returned by con_matrix()
    metaconn = metaconn_matrix_2brains(
        electrode_pairs, ch_con.ch_con, freqs_mean=[10], plot=True
    )

`statsCond(data, epochs, n_permutations, alpha)`

Perform statistical t-test on participant measures (e.g., PSD) for a condition.

This function tests whether the observed mean significantly deviates from 0 using a permutation-based t-test with False Discovery Rate (FDR) correction for multiple comparisons.

Parameters

data : np.ndarray

Array of participant measures with shape (n_samples, n_tests, n_freq), where n_tests typically represents channels and n_freq the frequencies. Values will be averaged across frequencies for statistics.

epochs : mne.Epochs

Epochs object for a condition from a random participant, used only to access information like channel positions.

n_permutations : int

Number of permutations for the statistical test. Should be at least 2*n_samples, e.g., 50000

alpha : float

Significance threshold for the test, e.g., 0.05

Returns

statsCondTuple : namedtuple

A named tuple containing:

- T_obs: T-statistic observed for all variables, array of shape (n_tests)
- p_values: p-values for all tests, array of shape (n_tests)
- H0: T-statistics obtained by permutations, array of shape (n_permutations)
- adj_p: tuple with boolean assessment of significance and FDR-corrected p-values
- T_obs_plot: statistical values for significant sensors, array of shape (n_tests)

Notes

This test calculates if the observed mean significantly deviates from 0; it doesn't compare two periods, but tests one period against the null hypothesis.

Randomized data are generated with random sign flips, and the test is two-tailed by default (the alternative hypothesis is that the data mean is different from 0).

To reduce false positives due to multiple comparisons, False Discovery Rate (FDR) correction is applied to the p-values.

The frequency dimension is reduced to one for the test (average in the frequency band-of-interest). To take frequencies into account, use cluster statistics (see statscondCluster function).
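The procedure described in these notes (frequency averaging, random sign flips, a two-tailed test against 0, then FDR correction) can be illustrated with plain NumPy. This is a sketch under stated assumptions and not the function's actual implementation: `sign_flip_ttest_fdr` is a hypothetical name, p-values are computed per test from the permutation distribution, and the Benjamini-Hochberg procedure is assumed for the FDR step.

```python
import numpy as np


def sign_flip_ttest_fdr(data, n_permutations=2000, alpha=0.05, seed=0):
    """One-sample sign-flip permutation t-test with Benjamini-Hochberg FDR.

    data has shape (n_samples, n_tests, n_freq). Illustrative only.
    """
    rng = np.random.default_rng(seed)
    x = data.mean(axis=2)  # average over the frequency band
    n_samples, n_tests = x.shape

    def t_stat(values):
        sem = values.std(axis=0, ddof=1) / np.sqrt(n_samples)
        return values.mean(axis=0) / sem

    t_obs = t_stat(x)

    # Null distribution: flip the sign of each participant's data at random.
    exceed = np.zeros(n_tests)
    for _ in range(n_permutations):
        signs = rng.choice([-1.0, 1.0], size=(n_samples, 1))
        exceed += np.abs(t_stat(x * signs)) >= np.abs(t_obs)
    p_values = (exceed + 1) / (n_permutations + 1)  # two-tailed

    # Benjamini-Hochberg step-up adjustment.
    order = np.argsort(p_values)
    ranked = p_values[order] * n_tests / np.arange(1, n_tests + 1)
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adj_p = np.empty(n_tests)
    adj_p[order] = np.clip(ranked, 0, 1)
    return t_obs, p_values, adj_p < alpha, adj_p


# Synthetic data: 20 participants, 5 channels, 4 frequencies.
rng = np.random.default_rng(42)
data = rng.normal(size=(20, 5, 4))
data[:, 0, :] += 3.0  # only the first channel has a true non-zero mean
t_obs, p_values, reject, adj_p = sign_flip_ttest_fdr(data)
print(reject)
```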

`statscluster(data, test, factor_levels, ch_con_freq, tail, n_permutations, alpha=0.05)`

Perform cluster-based statistical tests with various test statistics.

This function provides a flexible interface to cluster-based permutation tests, supporting different statistical tests including t-tests, one-way ANOVA, and multiple-way ANOVA for complex experimental designs.

Parameters

data : list or np.ndarray

For t-tests and one-way ANOVA: list of arrays containing values from different groups or conditions to compare. For multiple-way ANOVA: numpy array organized by factors.

test : str

Type of statistical test to use:

- 'ind ttest': Independent samples t-test
- 'rel ttest': Related (paired) samples t-test
- 'f oneway': One-way ANOVA
- 'f multipleway': Multiple-way ANOVA

factor_levels : List[int] or None

For multiple-way ANOVA, list specifying the number of levels for each factor (e.g., [2, 3] for a 2×3 design with 2 levels of factor 1 and 3 levels of factor 2). Set to None for other tests.

ch_con_freq : scipy.sparse.csr_matrix

Connectivity or metaconnectivity matrix defining adjacency across space and frequencies

tail : int

Direction of the test:

- 0: two-tailed test (must be used for f oneway)
- 1: one-tailed test (greater) (must be used for f multipleway)
- -1: one-tailed test (less)

n_permutations : int

Number of permutations for the statistical test, e.g., 50000

alpha : float, optional

Significance threshold for clusters (default=0.05)

Returns

statscondClusterTuple : namedtuple

A named tuple containing:

- Stat_obs: Observed statistic (T or F) for all variables
- clusters: Boolean array indicating locations in significant clusters
- cluster_p_values: p-value for each identified cluster
- H0: Max cluster-level statistics under permutation
- Stat_obs_plot: Statistical values for significant sensors

Notes

For multiple-way ANOVA with connectivity values, the last dimensions may need to be flattened from shape (n_sensors, n_sensors) to a vector using np.reshape.
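As a small illustration of that flattening step, the sketch below reshapes synthetic connectivity values. The leading dimensions (conditions, then subjects) and all variable names are assumptions made for the example.

```python
import numpy as np

n_conditions, n_subjects, n_sensors = 4, 10, 6

# Synthetic connectivity values, one (n_sensors, n_sensors) matrix
# per condition and subject.
rng = np.random.default_rng(0)
con_values = rng.random((n_conditions, n_subjects, n_sensors, n_sensors))

# Flatten the last two dimensions into one feature vector per matrix.
flat = np.reshape(
    con_values, (n_conditions, n_subjects, n_sensors * n_sensors)
)
print(flat.shape)

# Entry (i, j) of a sensor-by-sensor matrix lands at index i * n_sensors + j.
i, j = 3, 4
same_value = flat[1, 2, i * n_sensors + j] == con_values[1, 2, i, j]
```

Keeping track of this index mapping makes it possible to reshape cluster results back to (n_sensors, n_sensors) for plotting.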

The function applies different thresholding approaches based on the test type:

- For t-tests: Uses the alpha value directly
- For one-way ANOVA: Uses the alpha value directly
- For multiple-way ANOVA: Calculates appropriate F-thresholds based on factor levels

With t_power=1, each location is weighted by its statistical score within a cluster.

Examples

Independent t-test between two groups

    ind_ttest_results = statscluster(
        [group1_data, group2_data],
        test='ind ttest',
        factor_levels=None,
        ch_con_freq=connectivity.ch_con_freq,
        tail=0,  # two-tailed test
        n_permutations=10000,
        alpha=0.05
    )

2×2 repeated measures ANOVA (within-subjects design)

Factor 1: Condition (2 levels), Factor 2: Time (2 levels)

    anova_results = statscluster(
        data_array_2x2,  # Shape: (4, n_subjects, n_features)
        test='f multipleway',
        factor_levels=[2, 2],
        ch_con_freq=connectivity.ch_con_freq,
        tail=1,  # F-tests are one-tailed
        n_permutations=10000,
        alpha=0.05
    )

`statscondCluster(data, freqs_mean, ch_con_freq, tail, n_permutations, alpha=0.05)`

Perform cluster-level statistical permutation test on neurophysiological data.

This function applies a cluster-based permutation test to identify significant differences between conditions, correcting for multiple comparisons by taking into account connectivity across space and frequencies.

Parameters

data : list

List of arrays containing values from different conditions or groups to compare. Each array has shape (n_observations, n_features)

freqs_mean : list

Frequencies in the frequency-band-of-interest

ch_con_freq : scipy.sparse.csr_matrix

Connectivity or metaconnectivity matrix defining adjacency across space and frequencies, typically from con_matrix() or metaconn_matrix()

tail : int

Direction of the test:

- 0: two-tailed test
- 1: one-tailed test (greater)
- -1: one-tailed test (less)

n_permutations : int

Number of permutations for the statistical test, e.g., 50000

alpha : float, optional

Significance threshold for clusters (default=0.05)

Returns

statscondClusterTuple : namedtuple

A named tuple containing:

- F_obs: Observed F-statistic for all variables, array of shape (n_features)
- clusters: Boolean array indicating locations in significant clusters
- cluster_p_values: p-value for each identified cluster
- H0: Max cluster-level statistics under permutation, array of shape (n_permutations)
- F_obs_plot: F-values for significant sensors, array of shape (n_features)

Notes

This function uses MNE's permutation_cluster_test to perform the analysis.

With t_power=1 (default), each location is weighted by its statistical score within a cluster, which gives more weight to stronger effects.
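The effect of t_power on a cluster's score can be shown with a toy calculation. The helper `cluster_mass` is hypothetical and only mirrors the idea of raising each statistic to t_power (keeping its sign) before summing; it is not a call into MNE.

```python
import numpy as np


def cluster_mass(stat_values, t_power=1):
    """Sum of signed statistics raised to t_power (illustrative only)."""
    stat_values = np.asarray(stat_values, dtype=float)
    weighted = np.sign(stat_values) * np.abs(stat_values) ** t_power
    return float(weighted.sum())


# Statistic values at the three locations of one hypothetical cluster.
cluster = [2.0, 3.0, 4.0]
mass_weighted = cluster_mass(cluster, t_power=1)  # each location weighted by its score
mass_count = cluster_mass(cluster, t_power=0)     # every location counts as 1
print(mass_weighted, mass_count)
```

With t_power=0 a cluster is scored by its size alone, so a small cluster of strong effects and a small cluster of weak effects would be scored identically.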

The function automatically calculates appropriate thresholds for cluster formation based on the F-distribution for two-tailed tests.
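A cluster-forming threshold of this kind can be derived from the F-distribution with scipy.stats. The sketch below assumes one-way ANOVA degrees of freedom and the 1 - alpha quantile; the exact quantile level and degrees of freedom used by the function are not stated here, so treat both as assumptions. The helper name `f_threshold` is hypothetical.

```python
import numpy as np
from scipy import stats


def f_threshold(data, alpha=0.05):
    """F value above which a location may enter a cluster (assumed rule).

    data is a list of arrays of shape (n_observations, n_features),
    one array per condition or group.
    """
    n_groups = len(data)
    n_total = sum(len(d) for d in data)
    dfn = n_groups - 1        # between-group degrees of freedom
    dfd = n_total - n_groups  # within-group degrees of freedom
    return float(stats.f.ppf(1 - alpha, dfn, dfd))


# Two conditions with 20 observations each and 3 features.
condition1 = np.zeros((20, 3))
condition2 = np.zeros((20, 3))
threshold = f_threshold([condition1, condition2], alpha=0.05)
print(round(threshold, 2))
```

A lower alpha raises the threshold, which yields fewer and smaller candidate clusters before the permutation step.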

Examples

Compare alpha power between two conditions

    import numpy as np

    cluster_stats = statscondCluster(
        [condition1_data, condition2_data],
        freqs_mean=np.arange(8, 13),
        ch_con_freq=connectivity.ch_con_freq,
        tail=0,  # two-tailed test
        n_permutations=10000,
        alpha=0.05
    )

Get significant clusters

    significant_clusters = [
        i for i, p in enumerate(cluster_stats.cluster_p_values) if p <= 0.05
    ]
    print(f"Found {len(significant_clusters)} significant clusters")