March 7, 2025 Dr. Rohan Mehta

Riemannian Geometry for Motor-Imagery Classification: A Practical Introduction

By Dr. Rohan Mehta, Founder & CEO · Synaptiq

When I first encountered the Riemannian geometry approach to BCI classification — through Barachant and colleagues' work on covariance matrix classifiers — my initial reaction was the same one most signal processing engineers have: this seems like significant mathematical overhead for a problem we already have workable solutions to. Common spatial patterns plus LDA has been the field's workhorse for years. Why go to the trouble of geodesics and SPD manifolds?

After building and deploying motor-imagery decoders across multiple sessions and multiple hardware configurations, the answer became clear: LDA on CSP features works beautifully when your test data resembles your training data. The moment your sessions introduce non-stationarity — electrode drift, participant fatigue, neuroplastic change — Euclidean classifiers degrade in ways that Riemannian classifiers do not. The mathematical overhead is not overhead. It is the mechanism that makes the classifier hold up under non-stationarity.

What Is Wrong with Euclidean Feature Space

Standard motor-imagery classification pipelines extract features in Euclidean space. After bandpass filtering and CSP spatial filtering, you compute log-variance or power spectral density estimates, concatenate them into a feature vector, and train a linear discriminant analysis (LDA) or SVM classifier. The feature vector lives in Rⁿ, distances are computed with the standard Euclidean metric, and the decision boundary is a hyperplane.

The problem: EEG covariance matrices — which are the foundation of CSP and of any spatial filtering approach — are symmetric positive definite (SPD) matrices. They do not live in Euclidean space. They live on a curved Riemannian manifold: the set of all d×d real symmetric matrices with strictly positive eigenvalues, where d is the number of EEG channels.

When you take the arithmetic mean of two SPD matrices C₁ and C₂ in Euclidean space — i.e., (C₁ + C₂) / 2 — you get a matrix that is SPD. So far so good. But the arithmetic mean does not respect the geometry of the SPD manifold. The "midpoint" in Euclidean space is not the midpoint along the geodesic (shortest path on the manifold surface) between C₁ and C₂. Operations based on Euclidean arithmetic systematically distort the structure of covariance data.

This distortion is tolerable when the data is stationary — when sessions are short, electrodes do not move, and the participant's neural state is consistent. Under real clinical conditions, it is not tolerable. The Euclidean mean of a session's covariance matrices can shift substantially between sessions simply because electrode placement has changed by a few millimeters, altering the spatial mixing of neural sources onto electrode channels.

SPD Manifolds and the Riemannian Metric

The SPD manifold, denoted Sym⁺(d), has a natural Riemannian metric called the affine-invariant metric. For two SPD matrices P and Q, their geodesic distance is:

δ_R(P, Q) = ‖ log(P^{-1/2} Q P^{-1/2}) ‖_F

where log is the matrix logarithm and ‖·‖_F is the Frobenius norm. This metric has a crucial property: it is invariant under congruence transformations. If you apply any invertible matrix A to your data (which is what happens when electrode impedances change and re-scale individual channels), the geodesic distances between covariance matrices are preserved. This is exactly the invariance property you want in a classifier that must stay reliable under electrode configuration changes.

The Fréchet mean (also called the Riemannian mean or geometric mean) of a set of SPD matrices {C₁, ..., C_N} is the point M on the manifold that minimizes the sum of squared geodesic distances:

M = argmin_{X ∈ Sym+(d)} Σ_i δ_R(X, C_i)^2

There is no closed-form solution for the general case, but the iterative algorithm converges reliably — typically within 10–30 iterations for EEG-dimensional matrices. In practice, using the pyriemann library, computing the Riemannian mean of a set of 32-channel EEG covariance matrices takes single-digit milliseconds, well within real-time processing budgets.

Classification on the Manifold

The most straightforward Riemannian classifier for motor imagery is Minimum Distance to Riemannian Mean (MDM). During calibration, you compute one Riemannian mean per class — M_left and M_right for a two-class left/right hand imagery task. During online decoding, a new covariance matrix C is classified by computing its geodesic distance to each class mean and assigning it to the nearest:

class(C) = argmin_{k} δ_R(C, M_k)

This is conceptually analogous to nearest-centroid classification in Euclidean space, but the distance metric and mean computation respect the manifold geometry. The result is a classifier that is naturally resistant to outliers and to smooth distributional shifts — because geodesic distances are less sensitive to the scaling and rotation effects that dominate inter-session covariance changes.

An extension of MDM maps all covariance matrices to the tangent space at the Riemannian mean, transforming them into symmetric matrices in Euclidean space, where any standard Euclidean classifier (LDA, SVM, logistic regression) can then be applied. This tangent space projection — implemented in pyriemann as TangentSpace — gives access to the full arsenal of linear classifiers while operating in a space where the geometry is locally consistent with the data structure. This is the approach we use in the Synaptiq pipeline for its combination of classification flexibility and computational efficiency.

Why Session-to-Session Robustness Improves

The core intuition for why Riemannian classifiers handle inter-session non-stationarity better than Euclidean ones comes from the affine invariance of the Riemannian metric.

Consider what happens to EEG covariance matrices when electrode placement shifts between sessions. In a simplified model, if session 1 data has covariance structure C and session 2 has a slightly different electrode configuration representable as a linear mixing change A, then the covariance matrices in session 2 are approximately ACA^T. In Euclidean space, the distance between a session-2 covariance and the session-1 class mean is dominated by the transformation A, not by the motor-imagery content encoded in C.

Under the Riemannian metric with its affine invariance, the geometric relationship between covariance matrices within a session is preserved across the transformation A — to first order. The relative positions of left-hand and right-hand imagery covariances on the manifold are more stable than their absolute Euclidean positions. This is why MDM and tangent space classifiers maintain higher classification accuracy across sessions without explicit re-calibration, compared to LDA on CSP log-variance features.

We are not claiming that Riemannian methods make calibration unnecessary. Session-to-session accuracy will still degrade under large distributional shifts — neuroplastic changes over weeks of therapy, major electrode position changes, or significant changes in participant arousal state. What Riemannian geometry provides is a larger envelope of tolerable shift before accuracy falls below acceptable thresholds. In clinical practice, this translates to fewer required recalibration sessions and more consistent decode performance through a session as electrodes drift.

Practical Implementation Notes

Working with Riemannian geometry in Python, the pyriemann library is the essential tool. It integrates cleanly with scikit-learn's pipeline API, which matters for building reproducible preprocessing chains. A typical classification pipeline looks like:

from pyriemann.estimation import Covariances
from pyriemann.classification import MDM
from sklearn.pipeline import make_pipeline

pipeline = make_pipeline(
    Covariances(estimator='lwf'),  # Ledoit-Wolf regularized covariance
    MDM(metric='riemann')
)

pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)

The choice of covariance estimator matters. For EEG data where the number of time samples per trial is small relative to the number of channels — a common situation with short calibration recordings — the sample covariance matrix is poorly conditioned and its inverse is unreliable. Regularized estimators such as Ledoit-Wolf (lwf) or Tikhonov regularization (oracle approximating shrinkage) improve conditioning at the cost of some bias. In practice, Ledoit-Wolf performs well across most EEG channel counts and trial lengths encountered in clinical BCI.

For channel counts above ~64, computation of the Riemannian mean can become expensive if not parallelized. In the 16–32 channel range typical of rehabilitation EEG setups, this is not a practical constraint. A 32-channel Riemannian mean computation on modern hardware completes in under 5ms, leaving ample margin in a 100ms end-to-end latency budget.

The moabb (Mother of All BCI Benchmarks) repository provides standardized evaluation protocols for comparing Riemannian and Euclidean classifiers across public EEG datasets. For teams building clinical BCI software, it is worth running systematic benchmarks on MOABB before committing to a classification approach — the performance differences between methods are dataset-dependent, and Riemannian methods do not universally win on every task. Their advantage is most pronounced exactly where it matters for clinical deployment: in cross-session and cross-subject evaluation conditions, rather than within-session held-out test sets.

That distinction — within-session vs. cross-session evaluation — is the most important methodological choice you will make when benchmarking a motor-imagery decoder. Reporting only within-session accuracy is common in research but irrelevant for clinical systems. A decoder that achieves 90% within-session but 65% cross-session is not a clinical decoder. A decoder that achieves 82% within-session and 79% cross-session — as Riemannian approaches consistently demonstrate on open datasets — is.