Midterm Report

Machine Learning for Seizure Onset Zone (SOZ) Detection from sEEG

Literature Review

Epilepsy affects over 50 million people worldwide, and about 30% of patients have drug-resistant epilepsy, for which surgery is often the only curative option. The success of surgical intervention depends critically on accurate identification of the epileptogenic zone (EZ), specifically the Seizure Onset Zone (SOZ), the region where seizures originate.

Stereo-electroencephalography (sEEG), which involves intracranial recordings from depth electrodes, has become a gold standard for delineating the SOZ due to its ability to capture high-resolution spatiotemporal activity.

Key Biomarkers for SOZ Detection

Traditional approaches rely heavily on expert visual review, which is time-consuming, subjective, and prone to inter-rater variability. Machine learning methods including CNNs, RNNs, GNNs, and Transformers have shown promise in automating this process.

Dataset Description

We employ multi-center datasets to ensure variability in acquisition protocols, electrode configurations, and patient demographics, enabling assessment of cross-center generalization.

HUP iEEG Dataset

De-identified patient data (n=58) containing electrophysiologic data for interictal and ictal periods from the Hospital of the University of Pennsylvania.

Epilepsy iEEG Interictal Multicenter Dataset

De-identified patient data (n=39) with sleep/wake annotations from multiple centers including NIH, Johns Hopkins, and University of Miami.


Problem Definition and Motivation

For patients with drug-resistant focal epilepsy, surgical resection of the SOZ can be curative. However, the main clinical challenge lies in accurately identifying the SOZ from sEEG recordings. Mislocalization leads to failed surgeries and persistent seizures.

Current manual approaches are inefficient and limited by subjective interpretation. Our motivation is to develop machine learning models that can automatically detect the SOZ with high accuracy, robustness across datasets, and interpretability aligned with known biomarkers.

The project's impact extends beyond accuracy: automating SOZ detection can reduce diagnostic time, improve consistency, and potentially expand surgical candidacy by making evaluation more scalable.

Methods

This section outlines the preprocessing of sEEG signals and the model design for automatic SOZ detection.

Preprocessing

The sEEG signal at each electrode is decomposed with a Daubechies-2 (Db2) Wavelet Packet Transform (WPT). Five levels of decomposition provide localized time–frequency features that are useful for detecting spikes and high-frequency oscillations (HFOs) such as ripples. The compact support of the Db2 wavelet gives sharp temporal resolution for transient spikes (i.e., sudden high-amplitude discharges), while its two vanishing moments suppress smooth, slowly varying background activity in the detail coefficients. The resulting 32 sub-bands per channel help disentangle overlapping spectral components, so downstream models can identify ripples as sustained energy in specific sub-bands and spikes as transient bursts in adjacent bands.
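To make the sub-band granularity concrete, the sketch below computes the nominal frequency range of each level-5 leaf. The report does not state the sampling rate, so `fs_hz` is a free parameter and the 1024 Hz value in the usage note is purely hypothetical. The helper names are illustrative and are not part of the project code. The 80–250 Hz ripple band is the conventional definition, not a value taken from the report.

```python
def subband_edges(fs_hz, level=5):
    """Nominal (low, high) frequency edges of the 2**level WPT leaves.

    Bands are listed in ascending frequency order. The natural output
    order of a recursive WPT differs and would need reordering to match.
    """
    n_bands = 2 ** level
    width = (fs_hz / 2.0) / n_bands  # Nyquist range split evenly
    return [(i * width, (i + 1) * width) for i in range(n_bands)]


def bands_overlapping(edges, lo_hz, hi_hz):
    """Indices of sub-bands that overlap the interval (lo_hz, hi_hz)."""
    return [i for i, (a, b) in enumerate(edges) if a < hi_hz and b > lo_hz]
```

With a hypothetical sampling rate of 1024 Hz, each leaf would span 16 Hz, and `bands_overlapping(subband_edges(1024.0), 80, 250)` would select the leaves covering the ripple range.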

Implementation

A single-level Db2 DWT is implemented with torch.nn.functional.conv1d using analytical Db2 low-/high-pass filters. A recursive _wpt_recursive function performs full packet decomposition on both branches up to level L = 5, yielding 2^L leaf sub-bands per channel. Leaf outputs are concatenated to form the final WPT feature map.

Figure 1: Preprocessing pipeline for WPT-based feature extraction (Db2, 5 levels).

Filtering

Construct Db2 low/high filters from closed-form coefficients.

Single-Level Decomposition

Convolve and downsample by two to obtain approximation and detail coefficients.

Recursive Expansion

Apply packet decomposition on both branches at each level (L = 5).

Feature Concatenation

Concatenate all 2^5 leaves into a compact per-channel WPT representation.
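The four steps above can be sketched in a few lines. This is a dependency-free illustration, not the project's implementation: the report uses torch.nn.functional.conv1d, whereas this version uses plain Python lists and assumes periodic (wrap-around) boundary handling, which the report does not specify. The function names `dwt_step`, `wpt_leaves`, and `wpt_features` are hypothetical.

```python
import math

# Closed-form Db2 (4-tap Daubechies) low-pass coefficients.
_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
DB2_LO = [(1 + _S3) / _NORM, (3 + _S3) / _NORM,
          (3 - _S3) / _NORM, (1 - _S3) / _NORM]
# Quadrature-mirror high-pass filter: g[k] = (-1)^k * h[N-1-k].
DB2_HI = [((-1) ** k) * DB2_LO[len(DB2_LO) - 1 - k]
          for k in range(len(DB2_LO))]


def dwt_step(x, filt):
    """Filter with periodic wrap-around, then keep every second sample."""
    n = len(x)
    return [
        sum(filt[k] * x[(2 * i + k) % n] for k in range(len(filt)))
        for i in range(n // 2)
    ]


def wpt_leaves(x, level):
    """Recursively split BOTH branches; return the 2**level leaf sub-bands."""
    if level == 0:
        return [x]
    low = dwt_step(x, DB2_LO)    # approximation coefficients
    high = dwt_step(x, DB2_HI)   # detail coefficients
    return wpt_leaves(low, level - 1) + wpt_leaves(high, level - 1)


def wpt_features(x, level=5):
    """Concatenate all leaves into one flat feature vector for a channel."""
    return [c for leaf in wpt_leaves(x, level) for c in leaf]
```

Under these assumptions the transform is orthonormal, so the total energy of the input equals the total energy of the concatenated leaves, which makes a convenient sanity check. The input length should be a multiple of 2^level and at least 4 samples at the deepest split. Leaves come out in natural (recursion) order rather than in ascending frequency order.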

Model: Transformer-Based SOZ Detection with Contrastive Pre-Training

Design

We use a Transformer encoder that treats sEEG channels as an unordered set, enabling generalization across patients with varying channel counts and layouts. The model classifies each channel as SOZ vs. non-SOZ from a sequence of 10 clips, each 10 s long.

The model architecture consists of 5 main blocks:
tokenizer: passes each sample through the WPT and an MLP to produce "tokens"
temporal encoder: takes the temporal tokens for each channel and passes them through a transformer encoder layer
spatial preencoder: concatenates the temporal encodings and passes them through an MLP to obtain a lower-dimensional embedding for each channel
spatial encoder: passes the channel embeddings through a transformer encoder layer
projection head: passes the spatial encodings through an MLP to get the output logits

The Transformer architecture takes a 4D input with dimensions Batch Size × Sequence Length × Num Channels × Num Clips. Since the number of channels varies across patients, we assume a maximum channel count and pad every recording up to it so that dimensions are consistent. So that the model can tell real data from padding, a padding mask is generated and passed to the model along with the data. The mask is passed to each block so that computations are performed only on real data. Recording length also varies. However, instead of padding in time, we split each recording into equal-length sequences of 10 clips of 10 seconds each. Any trailing data that does not fill a complete sequence is discarded.
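The splitting and padding logic described above might look like the following sketch. It works on plain Python lists for clarity. The function names, the `fill` value, and the convention that `True` marks a real channel are all assumptions made for illustration and are not taken from the project code.

```python
def split_into_sequences(n_samples, fs_hz, clip_seconds=10, clips_per_seq=10):
    """Return, per sequence, the (start, stop) sample range of each clip.

    Trailing samples that do not fill a whole sequence are dropped.
    """
    clip_len = int(round(fs_hz * clip_seconds))
    seq_len = clip_len * clips_per_seq
    n_seq = n_samples // seq_len  # integer division discards the remainder
    return [
        [(s * seq_len + c * clip_len, s * seq_len + (c + 1) * clip_len)
         for c in range(clips_per_seq)]
        for s in range(n_seq)
    ]


def pad_channels(channels, max_channels, fill=0.0):
    """Pad a list of per-channel sample lists up to max_channels.

    Returns (padded, mask) where mask[i] is True for real channels
    and False for padding (an assumed convention).
    """
    n_real = len(channels)
    if n_real > max_channels:
        raise ValueError("recording has more channels than max_channels")
    width = len(channels[0]) if channels else 0
    n_pad = max_channels - n_real
    padded = [list(ch) for ch in channels] + [[fill] * width for _ in range(n_pad)]
    mask = [True] * n_real + [False] * n_pad
    return padded, mask
```

If the mask is later handed to PyTorch's nn.TransformerEncoder as `src_key_padding_mask`, note that a boolean mask there uses the opposite convention (True marks positions to ignore), so the mask above would need to be inverted.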

Input Tokenization

Training Strategy

Classification Fine-Tuning

Remove the projection head; add a linear + sigmoid classifier and optimize with focal class-balanced BCE.
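As a rough illustration of the loss, the sketch below combines a focal modulating factor with per-class weights. The report does not give the exact formulation, so the inverse-frequency class weights, the default `gamma=2.0`, and the function name `focal_balanced_bce` are assumptions. Inputs are probabilities after the sigmoid, and labels are 0 or 1.

```python
import math


def focal_balanced_bce(probs, labels, gamma=2.0, eps=1e-7):
    """Mean focal BCE with inverse-frequency class weights (assumed form).

    Per-sample loss: -w_class * (1 - p_t)**gamma * log(p_t),
    where p_t is the probability assigned to the true class.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    # Weights chosen so each class contributes equally in aggregate.
    w_pos = n / (2.0 * n_pos) if n_pos else 0.0
    w_neg = n / (2.0 * n_neg) if n_neg else 0.0
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        p_t = p if y == 1 else 1.0 - p
        w = w_pos if y == 1 else w_neg
        total += -w * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / n
```

With `gamma=0.0` and balanced classes this reduces to ordinary binary cross-entropy. Larger `gamma` values down-weight examples the model already classifies confidently, which matters when SOZ channels are a small minority.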

Outputs & Interpretation

Results and Evaluation

This section reports the current quantitative performance on validation and held-out test data. Cross-center generalization will be reported once those results are available.

Evaluation Protocol

Our current best-performing model achieves a validation AUROC of approximately 0.70 at epoch 10. When evaluated on the held-out test set using the checkpoint from this epoch, performance drops to an AUROC of about 0.63, which indicates moderate generalization.

Notably, the validation loss begins to increase after the early training epochs while the validation AUROC continues to improve. This is expected behavior: the loss penalizes the model for being confidently wrong, whereas AUROC measures only how well the model ranks SOZ channels above non-SOZ channels. As training progresses, the model becomes more confident in its predictions, which increases the loss when errors occur even though the underlying class separation improves. In other words, the model's probability calibration overfits before its ranking ability does. Since AUROC directly reflects separation ability, which is the priority in our task, the improving validation AUROC is the more meaningful indicator of progress.

Validation ROC at Epoch 1 (AUC ≈ 0.54).
Validation ROC at best-performing epoch (AUC ≈ 0.70).
Test ROC using selected model checkpoint (AUC ≈ 0.63).
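The divergence between loss and AUROC can be reproduced on a toy example. The sketch below uses made-up scores, not project data: the "late" scores rank the classes better than the "early" ones, yet one positive channel is scored confidently wrong, which drives the cross-entropy up. The helper names `auroc` and `bce` are illustrative.

```python
import math


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bce(scores, labels, eps=1e-7):
    """Mean binary cross-entropy on probabilities."""
    total = 0.0
    for p, y in zip(scores, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -math.log(p) if y == 1 else -math.log(1.0 - p)
    return total / len(labels)


# Toy data (hypothetical): three SOZ and three non-SOZ channels.
labels = [1, 1, 1, 0, 0, 0]
early = [0.55, 0.52, 0.45, 0.50, 0.48, 0.40]    # hesitant predictions
late = [0.99, 0.97, 0.005, 0.03, 0.001, 0.001]  # confident, one bad miss

early_stats = (auroc(early, labels), bce(early, labels))
late_stats = (auroc(late, labels), bce(late, labels))
```

Here the late scores have both the higher AUROC and the higher loss, because a single confidently wrong prediction dominates the cross-entropy while barely affecting the ranking.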

Headline Metrics

AUROC: 0.6259
AUPRC: 0.1681
Recall (SOZ): 0.6904
F1 (SOZ): 0.2579
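The gap between recall and F1 is easier to read once the implied precision is worked out. Since F1 = 2PR / (P + R), precision follows as P = F1 · R / (2R − F1). The sketch below applies this to the reported values, assuming recall and F1 were computed at the same decision threshold on the same data, which the report does not state explicitly. The function name is illustrative.

```python
def precision_from_f1_recall(f1, recall):
    """Solve F1 = 2PR / (P + R) for precision P."""
    denom = 2.0 * recall - f1
    if denom <= 0:
        raise ValueError("F1 and recall are inconsistent (need F1 < 2 * recall)")
    return f1 * recall / denom


implied_precision = precision_from_f1_recall(0.2579, 0.6904)
```

Under that assumption the implied precision is roughly 0.16, meaning that most channels flagged as SOZ at this threshold are false positives, which is consistent with the low AUPRC.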