About CURATOR

Research Objectives

Current clinical protocols are typically standardized rather than personalized, which means they fail to account for individual differences in brain dynamics, cognitive goals, and lifestyle factors. This lack of personalization makes treatment optimization slow, inefficient, and reliant on trial-and-error approaches.

The key scientific contribution of CURATOR lies in validating biomarkers as predictors of treatment response, systematically mapping modality-outcome relationships, and assessing whether multimodal AI models can outperform current heuristic approaches.

Research Hypotheses

  1. Specific EEG biomarkers (e.g., alpha peak frequency, connectivity measures) can predict treatment response with accuracy significantly above chance level.
  2. Feedback modality (visual, auditory, interactive) significantly moderates treatment outcomes, and matching modality to the individual improves outcomes relative to a one-size-fits-all assignment.
  3. Multimodal integration via modern LLMs yields more accurate and interpretable treatment recommendations than current clinician-only heuristics.

Methodology

The project combines rigorous analysis and multimodal AI integration to (1) test our research hypotheses and (2) deliver a clinically useful tool. Each methodological step is directly tied to one or more hypotheses and is designed to be feasible within the fellowship timeframe while producing generalizable scientific insights.

Data preprocessing. EEG signals will be preprocessed with bandpass filtering, notch filtering, bad-channel detection, ICA or automated artifact subspace separation for ocular/muscle artifacts, and robust normalization across sessions. This approach follows best practice recommendations for reproducibility in EEG pipelines [Bigdely-Shamlo et al., 2015; Gabard-Durnam et al., 2018]. Preprocessing will include pseudonymization of raw files and secure storage, ensuring GDPR compliance from the earliest stage of the pipeline.
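The filtering and normalization steps above can be sketched as follows; this is a minimal illustration, and all cutoffs, filter orders, and the notch frequency are placeholder values, not CURATOR's actual pipeline settings (which will follow the cited best-practice recommendations):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, iirnotch, filtfilt

def preprocess(eeg, fs, band=(1.0, 40.0), notch_hz=50.0):
    """Bandpass + notch filter a (channels x samples) EEG array,
    then apply robust per-channel normalization.
    All parameters here are illustrative defaults."""
    # Zero-phase bandpass (second-order sections for numerical stability).
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    eeg = sosfiltfilt(sos, eeg, axis=-1)
    # Notch out power-line interference.
    bn, an = iirnotch(notch_hz, Q=30.0, fs=fs)
    eeg = filtfilt(bn, an, eeg, axis=-1)
    # Robust normalization: median/IQR scaling per channel, which is less
    # sensitive to residual artifacts than mean/std scaling.
    med = np.median(eeg, axis=-1, keepdims=True)
    iqr = np.subtract(*np.percentile(eeg, [75, 25], axis=-1)).reshape(-1, 1)
    return (eeg - med) / np.where(iqr > 0, iqr, 1.0)
```

Artifact removal (ICA or artifact subspace reconstruction), bad-channel detection, and pseudonymization would sit around this core in the full pipeline.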

Feature extraction. We will extract conventional spectral biomarkers (absolute and relative power in canonical EEG bands), peak alpha frequency and band ratios (e.g., theta/alpha), and metrics of spectral variability (e.g., coefficient of variation across epochs). For task-based protocols (e.g., oddball and attention tasks) we will compute ERP amplitudes/latencies, including P300 [Arvaneh et al., 2019]. Connectivity measures (e.g., coherence, phase-locking value) and graph-theoretic summaries (clustering, modularity, dynamic reconfiguration) will also be derived.
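For the spectral biomarkers, a minimal sketch of relative band power, peak alpha frequency, and the theta/alpha ratio from a Welch power spectrum might look like this (band boundaries are the canonical conventions, not project-specific choices):

```python
import numpy as np
from scipy.signal import welch

# Canonical band definitions in Hz (illustrative; conventions vary by lab).
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def spectral_features(eeg, fs):
    """Relative band power, peak alpha frequency, and theta/alpha ratio
    per channel, from a (channels x samples) array."""
    freqs, psd = welch(eeg, fs=fs, nperseg=2 * fs)
    total = psd.sum(axis=-1)
    feats = {}
    for name, (lo, hi) in BANDS.items():
        m = (freqs >= lo) & (freqs < hi)
        feats[f"rel_{name}"] = psd[..., m].sum(axis=-1) / total
    # Peak alpha frequency: location of the PSD maximum within 8-13 Hz.
    alpha = (freqs >= 8) & (freqs < 13)
    feats["peak_alpha_hz"] = freqs[alpha][np.argmax(psd[..., alpha], axis=-1)]
    feats["theta_alpha_ratio"] = feats["rel_theta"] / feats["rel_alpha"]
    return feats
```

ERP and connectivity features follow the same pattern: each extractor maps a preprocessed recording to a fixed-length feature vector for the downstream models.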

Candidate modalities and paradigms. Prior studies show modality effects vary by task and population [Sigrist et al., 2013; Proulx et al., 2022]. We will assess visual (abstract gauges, dynamic scenes, game visuals), auditory (sonification, tonal reinforcement), and interactive (simple games where EEG control modifies gameplay) feedback modalities. Evaluation will proceed in two phases: (1) single-session laboratory learning to capture immediate engagement and (2) multi-session pilots to assess sustained efficacy.

Engagement and learning metrics. For each modality we will measure objective neurofeedback learning (change in target biomarker per unit time), behavioral indices (task performance, if applicable), physiological proxies (heart rate variability and pupillometry, where available), and subjective usability (standardized questionnaires). These metrics form the basis for per-individual modality matching rules.
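One plausible operationalization of "change in target biomarker per unit time" is the least-squares slope of the biomarker over session time; this is an illustrative choice, not necessarily the estimator CURATOR will adopt:

```python
import numpy as np

def learning_rate(biomarker_values, times_min):
    """Objective neurofeedback learning as the least-squares slope of the
    target biomarker over session time (biomarker units per minute).
    Positive slope = the participant is up-regulating the target."""
    slope, _intercept = np.polyfit(times_min, biomarker_values, 1)
    return slope
```

Per-individual modality matching could then compare these slopes (alongside the behavioral, physiological, and usability metrics) across the candidate modalities tested in the single-session phase.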

Machine Learning models for biomarker selection. We will benchmark compact convolutional models such as EEGNet [Lawhern et al., 2018] and Deep4Net [Schirrmeister et al., 2017] for EEG decoding. EEGNet is particularly suited for low-latency inference [Bian et al., 2024]. Candidate models will be compared against simpler baselines (e.g., linear/logistic regression on spectral features) to ensure gains are scientifically meaningful.
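The simpler baseline mentioned above can be sketched as a standardized logistic regression on the extracted spectral features, evaluated with cross-validated AUC; deep models such as EEGNet would need to beat this number to justify their complexity (a sketch, not the project's final evaluation protocol):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def baseline_auc(X, y, folds=5):
    """Cross-validated ROC AUC of a logistic-regression baseline on a
    (subjects x features) matrix of spectral biomarkers."""
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return cross_val_score(clf, X, y, cv=folds, scoring="roc_auc").mean()
```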

Multimodal fusion. Structured inputs (EEG biomarkers, questionnaire scores, cognitive test scores) will be fused either via late fusion (per-modality model outputs combined by a meta-learner) or via early fusion with deep models (concatenated embeddings), depending on the available sample sizes. Unstructured clinical texts (patient history) will be encoded with pretrained LLMs, avoiding extensive manual feature engineering, and further processed via retrieval-augmented generation (RAG) to ground outputs in verifiable evidence.
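The late-fusion route can be sketched as stacking: each modality gets its own model, out-of-fold probabilities prevent the meta-learner from seeing leaked training predictions, and a logistic meta-learner combines them. Modality names and model choices below are illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def late_fusion(modality_features, y):
    """Late fusion via stacking. `modality_features` maps modality names
    (e.g. 'eeg', 'questionnaire') to (subjects x features) arrays.
    Returns the fitted meta-learner and the out-of-fold base predictions."""
    oof = np.column_stack([
        # Out-of-fold class-1 probabilities from one base model per modality.
        cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                          cv=5, method="predict_proba")[:, 1]
        for X in modality_features.values()
    ])
    meta = LogisticRegression(max_iter=1000).fit(oof, y)
    return meta, oof
```

With larger samples, the early-fusion alternative replaces the per-modality models with a single network over concatenated embeddings.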

Model evaluation. Performance will be evaluated via nested cross-validation and temporal holdouts. Metrics will include ROC/AUC for binary classification, mean absolute error for continuous outcomes, and calibration curves for clinical interpretability. We estimate that, in our setting, a sample of 20 patients (10 sessions each) is sufficient to demonstrate feasibility and provide proof-of-concept evidence for larger-scale follow-ups.
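Nested cross-validation separates hyperparameter tuning (inner loop) from performance estimation (outer loop), so the reported AUC is not inflated by tuning on the test folds. A minimal sketch with illustrative grids and fold counts:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score

def nested_cv_auc(X, y):
    """Nested CV: the inner GridSearchCV tunes regularization strength C;
    the outer cross_val_score reports an unbiased ROC AUC estimate."""
    inner = GridSearchCV(LogisticRegression(max_iter=1000),
                         {"C": [0.01, 0.1, 1.0, 10.0]},
                         cv=3, scoring="roc_auc")
    return cross_val_score(inner, X, y, cv=5, scoring="roc_auc").mean()
```

Temporal holdouts (training on early sessions, testing on later ones) would complement this by checking that predictions generalize forward in time.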

Explainability and clinical audit. All deployed models will include explainability outputs (feature-level attributions, confidence intervals, and out-of-distribution alarms). SHAP [Lundberg & Lee, 2017] or integrated gradients [Sundararajan et al., 2017] will be used for auditing our trained models.
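As a lightweight stand-in for SHAP or integrated gradients, permutation importance gives a model-agnostic feature attribution that is easy to audit: shuffle one feature at a time and measure how much ROC AUC degrades. This sketch is illustrative and not a substitute for the cited attribution methods:

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression

def audit_attributions(model, X, y, feature_names, n_repeats=20, seed=0):
    """Feature-level attributions via permutation importance: the mean drop
    in ROC AUC when each feature is shuffled. Larger = more relied upon."""
    result = permutation_importance(model, X, y, scoring="roc_auc",
                                    n_repeats=n_repeats, random_state=seed)
    return dict(zip(feature_names, result.importances_mean))
```

Feature names here would be the extracted biomarkers (e.g., relative alpha power), so clinicians can check that the model leans on physiologically plausible signals.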

LLM choices. For production-grade summarization we will test proprietary (e.g., OpenAI's GPT-4o, Anthropic's Claude) and open-source (Meta's LLaMA-4 variants) LLMs. All outputs will be verified by clinicians (human-in-the-loop), with a fallback to rule-based templates if LLM outputs fail audit checks.

Collaboration

CURATOR is a collaboration between the University of Luxembourg and Neurofeedback Luxembourg (Servicium SA), combining advanced research capabilities in HCI and Machine Learning with direct clinical expertise and access to patient populations.