April 22, 2025 Dr. Rohan Mehta

Common Spatial Pattern Filter Banks: What They Do and When They Break

By Dr. Rohan Mehta, Founder & CEO · Synaptiq

Common Spatial Patterns (CSP) is one of those methods in BCI signal processing that occupies an interesting position: nearly everyone uses it, nearly everyone understands its basic operation, and yet the conditions under which it fails — and what to do about those failures — are less consistently understood. The filter bank extension of CSP (FBCSP) is the most widely adopted solution to one of CSP's core limitations, but it introduces its own set of engineering trade-offs that matter in clinical deployment.

I want to work through three things: the mechanics of CSP itself, the specific failure mode that FBCSP addresses, and the failure modes that FBCSP introduces. Understanding where an algorithm breaks is as important as understanding where it works, perhaps more so when you are shipping a decoder into a clinical environment, where the consequence of degraded classification is a person's rehabilitation therapy not functioning as intended.

What CSP Actually Does

CSP is a generalized eigenvalue decomposition method. Given two covariance matrices estimated from two classes of EEG data — say, C_L for left-hand motor imagery and C_R for right-hand motor imagery — CSP finds a set of spatial filters W such that the variance of the spatially filtered signal is maximized for one class and simultaneously minimized for the other.

Formally, CSP solves the generalized eigenvalue problem:

C_L w = λ C_R w

The eigenvectors w that correspond to the largest eigenvalues λ are the filters that maximize variance for class L relative to class R. The eigenvectors corresponding to the smallest eigenvalues are the filters that maximize variance for class R relative to class L. In practice, you select the m filters from each end of the eigenspectrum — typically m = 3 — giving you 2m spatial filters total.
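The eigenvalue problem and filter selection above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function names are hypothetical, the trials are assumed to be bandpass filtered already, and the per-trial trace normalization is a common convention that the description above does not require.

```python
import numpy as np
from scipy.linalg import eigh


def mean_covariance(trials):
    """Average trace-normalized spatial covariance over trials.

    trials: array (n_trials, n_channels, n_samples), already bandpass filtered.
    """
    covs = []
    for x in trials:
        c = x @ x.T
        covs.append(c / np.trace(c))  # assumption: normalize out per-trial power
    return np.mean(covs, axis=0)


def csp_filters(trials_l, trials_r, m=3):
    """Return (W, eigenvalues), with the 2m spatial filters as rows of W."""
    c_l = mean_covariance(trials_l)
    c_r = mean_covariance(trials_r)
    # Solves C_L w = lambda C_R w; eigenvalues come back in ascending order.
    eigvals, eigvecs = eigh(c_l, c_r)
    n = len(eigvals)
    # First m: most variance for class R. Last m: most variance for class L.
    keep = np.concatenate([np.arange(m), np.arange(n - m, n)])
    return eigvecs[:, keep].T, eigvals[keep]
```

The sketch needs at least 2m channels and a positive definite C_R, which in practice means enough samples per class relative to the channel count.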

After applying these filters to the bandpass-filtered EEG, you compute the log-variance of each spatially filtered component across a trial window (typically 0.5–2 seconds for motor imagery). Because the variance of a bandpass-filtered signal is its power in that band, the resulting 2m-dimensional feature vector captures how band power differs between the two classes along the spatial directions most discriminative for motor imagery. This vector is then fed to a standard classifier, typically LDA or an SVM; Riemannian geometry approaches, which are increasingly used, work on the covariance matrices directly instead.
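The feature extraction step is short enough to write out. The sketch below assumes the spatial filters are stored as rows of a matrix W; the function name is hypothetical, and dividing by the total variance across filters before taking the log is one common convention rather than something the description above prescribes.

```python
import numpy as np


def log_variance_features(trials, W):
    """Log-variance features for a batch of trials.

    trials: (n_trials, n_channels, n_samples), bandpass filtered.
    W:      (n_filters, n_channels), spatial filters as rows.
    Returns an array of shape (n_trials, n_filters).
    """
    # Apply every spatial filter to every trial.
    projected = np.einsum("fc,tcs->tfs", W, trials)
    var = projected.var(axis=2)
    # Assumption: normalize by the summed variance so features are relative power.
    return np.log(var / var.sum(axis=1, keepdims=True))
```

With m = 3 this yields a six-dimensional vector per trial, which is what the classifier sees.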

Why does this work for motor imagery? Because left-hand motor imagery produces ERD (event-related desynchronization, a decrease in oscillatory power) over right sensorimotor cortex, and right-hand imagery produces ERD over left sensorimotor cortex. The spatial filters found by CSP align with these lateralized patterns: the filters with high variance for class L weight the left sensorimotor cortex, where power stays high during left-hand imagery but is suppressed during right-hand imagery, and vice versa. The method extracts the neural signal of interest without requiring you to specify where it is; it discovers the discriminative spatial pattern from data.

The Frequency Specificity Problem

CSP as described above operates on bandpass-filtered data. The standard choice is a single broadband filter covering the mu and beta bands — typically 8–30 Hz. The problem with this choice is that motor imagery ERD does not occur uniformly across 8–30 Hz. Different sub-bands carry different amounts of discriminative information, and the optimal sub-band varies across participants and even across sessions for the same participant.

Some participants show strong mu-band (8–13 Hz) ERD with relatively weak beta (13–30 Hz) response. Others show the reverse. Some have a narrow individual alpha peak that makes an 8–13 Hz filter suboptimal: their motor-related desynchronization is concentrated at, say, 10–11 Hz. A fixed 8–30 Hz bandpass averages across these individual differences, and the uninformative sub-bands contribute variance that dilutes the discriminative content of the features.

Blankertz and colleagues addressed this through frequency band optimization — selecting the bandpass filter that maximizes subject-specific classification performance. This is effective but requires a held-out validation set during calibration and risks overfitting to the specific session's signal characteristics.

Filter Bank CSP: The Approach and Its Trade-offs

FBCSP (Filter Bank CSP) takes a different approach: instead of selecting a single optimal frequency band, apply CSP independently within multiple sub-bands and combine the resulting features. A typical filter bank configuration might cover:

4–8 Hz (theta/low-mu range)
8–12 Hz (mu band, lower)
12–16 Hz (mu/beta transition)
16–24 Hz (beta, lower)
24–32 Hz (beta, upper)
8–32 Hz (broadband, as a baseline)

CSP is computed separately within each band, yielding a set of spatial filters per band. Log-variance features are extracted from each filter set. The full feature vector, which concatenates all band-specific features, is typically high-dimensional, so a feature selection step is applied before classification: mutual information-based best individual feature (MIBIF) selection or a similar criterion reduces the feature space to the most discriminative subset.
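Put together, the pipeline described above looks roughly like the following sketch. Several things here are assumptions made for illustration: the function names, the fourth-order Butterworth filters, the zero-phase filtering (suitable for offline analysis only), and the crude histogram estimate of mutual information, which stands in for whatever estimator a given FBCSP implementation actually uses. Class labels are assumed to be 0 and 1.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.signal import butter, sosfiltfilt

# Sub-bands listed in the text, in Hz.
BANDS = [(4, 8), (8, 12), (12, 16), (16, 24), (24, 32), (8, 32)]


def bandpass(trials, lo, hi, fs, order=4):
    sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, trials, axis=-1)  # zero-phase, offline only


def csp(trials_a, trials_b, m):
    cov = lambda t: np.mean([x @ x.T / np.trace(x @ x.T) for x in t], axis=0)
    _, vecs = eigh(cov(trials_a), cov(trials_b))  # ascending eigenvalues
    n = vecs.shape[1]
    return vecs[:, np.r_[0:m, n - m:n]].T  # m filters from each end


def log_var(trials, W):
    var = np.einsum("fc,tcs->tfs", W, trials).var(axis=2)
    return np.log(var / var.sum(axis=1, keepdims=True))


def fbcsp_features(trials, y, fs, m=3):
    """Concatenate per-band CSP log-variance features: (n_trials, n_bands * 2m)."""
    blocks = []
    for lo, hi in BANDS:
        x = bandpass(trials, lo, hi, fs)
        W = csp(x[y == 0], x[y == 1], m)
        blocks.append(log_var(x, W))
    return np.hstack(blocks)


def mutual_information(x, y, n_bins=4):
    """Crude histogram estimate of I(x; y) in nats, using quantile bins."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    xb = np.digitize(x, edges)
    mi = 0.0
    for b in np.unique(xb):
        for c in np.unique(y):
            p_xy = np.mean((xb == b) & (y == c))
            if p_xy > 0:
                mi += p_xy * np.log(p_xy / (np.mean(xb == b) * np.mean(y == c)))
    return mi


def rank_features(F, y):
    """Feature indices sorted from most to least informative."""
    scores = np.array([mutual_information(F[:, j], y) for j in range(F.shape[1])])
    return np.argsort(scores)[::-1]
```

Note that `fbcsp_features` fits the spatial filters and transforms the same trials. In a real decoder the filters and the selected feature indices would be fit on calibration data only and then applied unchanged to new trials.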

The argument for FBCSP is that it automatically captures the discriminative frequency band for any given participant or session without requiring explicit band optimization. The participant-specific peak frequency is covered by at least one of the sub-bands regardless of where it falls.

The argument against — and this matters for clinical deployment — is exactly the high dimensionality. With 6 sub-bands × 6 CSP filters per band = 36 features before any selection, the feature space is large relative to the amount of calibration data typically available in a clinical setting. Feature selection on limited data introduces selection bias. FBCSP classifiers trained on small calibration sets (e.g., 40 trials) overfit more readily than narrow-band CSP classifiers, because the feature selection step is selecting noise features along with signal features.
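The selection bias is easy to reproduce on pure noise. The simulation below is a sketch under stated assumptions: it uses 36 random features and 40 trials to match the numbers above, a simple standardized mean-difference score in place of mutual information, and a nearest-class-mean classifier in place of LDA. All function names are hypothetical. It compares leave-one-out accuracy when features are selected once on all trials (including the held-out one) against selection repeated inside each fold.

```python
import numpy as np


def select_top_k(X, y, k):
    """Rank features by absolute class-mean difference over the overall std."""
    d = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    score = np.abs(d) / (X.std(axis=0) + 1e-12)
    return np.argsort(score)[::-1][:k]


def nearest_mean_predict(X_train, y_train, x):
    mu0 = X_train[y_train == 0].mean(axis=0)
    mu1 = X_train[y_train == 1].mean(axis=0)
    return int(np.sum((x - mu1) ** 2) < np.sum((x - mu0) ** 2))


def loo_accuracy(X, y, k, select_inside):
    n = len(y)
    if not select_inside:
        idx_all = select_top_k(X, y, k)  # has seen the held-out trial: biased
    correct = 0
    for i in range(n):
        train = np.arange(n) != i
        idx = select_top_k(X[train], y[train], k) if select_inside else idx_all
        pred = nearest_mean_predict(X[train][:, idx], y[train], X[i, idx])
        correct += int(pred == y[i])
    return correct / n


def simulate(n_reps=50, n_trials=40, n_features=36, k=4, seed=0):
    """Mean leave-one-out accuracy on noise: (selection outside, selection inside)."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_trials // 2)
    biased, honest = [], []
    for _ in range(n_reps):
        X = rng.standard_normal((n_trials, n_features))  # no class information
        biased.append(loo_accuracy(X, y, k, select_inside=False))
        honest.append(loo_accuracy(X, y, k, select_inside=True))
    return float(np.mean(biased)), float(np.mean(honest))
```

Since the features carry no class information, any accuracy above chance from the first number returned by `simulate()` comes from the selection step alone, which is the mechanism behind the overfitting described above.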

When CSP (and FBCSP) Actually Break

The easier failure mode to understand is the cross-session one. CSP estimates covariance matrices from calibration data collected at session start and finds spatial filters that are optimal for the covariance structure present during calibration. When electrode placement shifts between sessions, even by a few millimeters, the spatial mixing matrix changes, and the optimal filters from session one are no longer optimal for session two.

This is the failure mode that drives us toward Riemannian geometry approaches at Synaptiq. The CSP filters are estimated in Euclidean space from sample covariance matrices, and their optimality does not transfer across sessions when the covariance distribution shifts. A Riemannian minimum-distance-to-mean classifier operating directly on trial covariance matrices, with Euclidean alignment applied at session boundaries, shows substantially better cross-session stability because it does not depend on the stationarity of spatial filter estimates.
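As an illustration of the alignment step, here is a minimal sketch of one common formulation of Euclidean alignment: each session's trials are whitened by the inverse square root of that session's arithmetic mean covariance, so that every session ends up with an identity mean covariance. The function names are hypothetical, and this is not necessarily how the Synaptiq pipeline implements it.

```python
import numpy as np


def inv_sqrt_spd(M):
    """Inverse matrix square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(M)
    return (vecs / np.sqrt(vals)) @ vecs.T


def euclidean_align(trials):
    """Whiten one session's trials by its mean spatial covariance.

    trials: (n_trials, n_channels, n_samples) from a single session.
    """
    covs = np.einsum("tcs,tds->tcd", trials, trials) / trials.shape[2]
    r_inv_sqrt = inv_sqrt_spd(covs.mean(axis=0))
    return np.einsum("cd,tds->tcs", r_inv_sqrt, trials)
```

Applied separately to each session, this removes session-specific shifts in the average covariance before any classifier sees the data. It does not need labels, so it can run on the first unlabeled trials of a new session.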

The within-session failure mode is different: artifact contamination during the calibration window. If the 2-minute calibration recording contains EMG artifacts or large EOG events, the covariance estimates C_L and C_R are corrupted, and the CSP filters partially align with the artifact spatial pattern rather than the neural motor-imagery pattern. The resulting classifier performs well as long as the artifacts resemble those seen during calibration, but fails when artifact patterns differ between calibration and online decoding, which happens routinely in rehabilitation sessions where participant effort and fatigue levels vary.

Regularized covariance estimation — using Ledoit-Wolf regularization, minimum covariance determinant estimation, or geometric median covariance — partially mitigates this. Artifact rejection before covariance estimation is the better solution but requires reliable artifact detection, which is its own unsolved problem in online EEG processing.
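To make the regularization idea concrete, the sketch below shrinks the sample covariance toward a scaled identity matrix. It is a simplification: the shrinkage weight `alpha` is a hand-set, hypothetical parameter, whereas Ledoit-Wolf estimates that weight from the data. The function name is also an assumption.

```python
import numpy as np


def shrunk_covariance(x, alpha=0.1):
    """Shrink the sample covariance toward a scaled identity.

    x:     (n_channels, n_samples) segment of EEG.
    alpha: shrinkage weight in [0, 1]; fixed by hand here, not estimated.
    Returns (1 - alpha) * C + alpha * mu * I, with mu the mean eigenvalue of C.
    """
    x = x - x.mean(axis=1, keepdims=True)
    c = x @ x.T / x.shape[1]
    mu = np.trace(c) / c.shape[0]
    return (1.0 - alpha) * c + alpha * mu * np.eye(c.shape[0])
```

The shrunk estimate keeps the total variance (trace) of the sample covariance while pulling extreme eigenvalues toward the mean, which bounds how much a single artifact-dominated direction can distort the generalized eigendecomposition.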

Practical Choice: When to Use CSP vs. Riemannian

For a single-session or controlled within-session scenario with sufficient calibration data (100+ trials per class) and stable electrode placement, FBCSP with feature selection and LDA or SVM classification is competitive with Riemannian approaches and often simpler to tune. The MOABB benchmarks support this: FBCSP outperforms the Riemannian minimum distance to mean (MDM) classifier on some datasets under within-session evaluation.

For multi-session clinical deployment — which is the relevant scenario for rehabilitation BCI — Riemannian geometry classifiers with session alignment consistently outperform CSP-based approaches, often by substantial margins. The cross-session stability advantage of Riemannian methods comes precisely from not depending on fixed spatial filter estimates that can become stale as electrode configurations change.

The practical answer for clinical BCI: use the filter bank as the front end of a Riemannian classifier (computing covariance matrices within each filter bank sub-band and classifying them on the Riemannian manifold), rather than treating the two approaches as mutually exclusive. This hybrid is sometimes called Filter Bank Riemannian (FBR) and combines the frequency-specificity benefits of the filter bank with the cross-session robustness of Riemannian classification. It is currently one of the strongest approaches on open cross-session BCI benchmarks, and it is the architecture that the Synaptiq decode pipeline is based on.
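A minimal sketch of the hybrid idea follows, under several assumptions that are not taken from the Synaptiq pipeline: a reduced set of four bands, a log-Euclidean class mean as a cheap stand-in for the iteratively computed Riemannian mean, and squared affine-invariant distances summed across bands as the way to combine them. All function names are hypothetical.

```python
import numpy as np
from scipy.linalg import eigvalsh
from scipy.signal import butter, sosfiltfilt

BANDS = [(8, 12), (12, 16), (16, 24), (24, 32)]  # illustrative subset, in Hz


def band_covariances(trials, fs, bands=BANDS):
    """Per-band trial covariances: (n_bands, n_trials, n_channels, n_channels)."""
    out = []
    for lo, hi in bands:
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        x = sosfiltfilt(sos, trials, axis=-1)
        out.append(np.einsum("tcs,tds->tcd", x, x) / x.shape[-1])
    return np.array(out)


def riemann_distance(A, B):
    """Affine-invariant distance from the generalized eigenvalues of (A, B)."""
    return np.sqrt(np.sum(np.log(eigvalsh(A, B)) ** 2))


def log_euclid_mean(covs):
    """Log-Euclidean mean of SPD matrices (assumption: approximates the Riemannian mean)."""
    logs = []
    for C in covs:
        vals, vecs = np.linalg.eigh(C)
        logs.append((vecs * np.log(vals)) @ vecs.T)
    vals, vecs = np.linalg.eigh(np.mean(logs, axis=0))
    return (vecs * np.exp(vals)) @ vecs.T


def fit_class_means(covs, y):
    """One mean covariance per class and per band."""
    return {c: [log_euclid_mean(covs[b][y == c]) for b in range(len(covs))]
            for c in np.unique(y)}


def predict(covs, means):
    """Assign each trial to the class with the smallest summed squared distance."""
    classes = sorted(means)
    n_bands, n_trials = covs.shape[:2]
    preds = []
    for t in range(n_trials):
        d = [sum(riemann_distance(covs[b, t], means[c][b]) ** 2
                 for b in range(n_bands)) for c in classes]
        preds.append(classes[int(np.argmin(d))])
    return np.array(preds)
```

No spatial filters are estimated anywhere in this sketch, which is the point of the hybrid: the filter bank supplies frequency specificity, and classification depends only on distances between covariance matrices.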