Focus
Machine Learning, Neurology, Biomedical Data Science
Motivation
Early Detection, Accessibility, Clinical Interpretability
About the project
This research investigates how machine learning can support the early, non-invasive detection of Parkinson’s disease (PD), a neurodegenerative condition often diagnosed only after substantial neuronal loss. The study compares two interpretable models, Logistic Regression (LR) and Random Forest (RF), each applied separately to three publicly available datasets: vocal tremor recordings (UCI Parkinson’s dataset), REM sleep parameters (PhysioNet Sleep-EDF), and smartphone-based movement data (mPower). Using stratified cross-validation, the paper tests how well each model discriminates PD within each modality.
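As a rough illustration of the stratified cross-validation mentioned above, the following dependency-free sketch deals sample indices into folds class by class, so each fold keeps roughly the same PD-to-control ratio as the full dataset. The function names, the fold count of 5, and the 30-versus-10 label vector are assumptions made for this example; they are not taken from the study, which would more plausibly rely on a library implementation.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits=5, seed=0):
    """Assign each sample index to one of n_splits folds, preserving class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal this class round-robin so every fold gets a near-equal share of it.
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


def train_test_splits(labels, n_splits=5, seed=0):
    """Yield (train_indices, test_indices), using each fold once as the test set."""
    folds = stratified_folds(labels, n_splits, seed)
    for k in range(n_splits):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test


# Hypothetical, imbalanced labels: 1 = PD, 0 = control.
labels = [1] * 30 + [0] * 10
for train, test in train_test_splits(labels, n_splits=5):
    pd_share = sum(labels[i] for i in test) / len(test)
    print(len(train), len(test), round(pd_share, 2))
```

In this toy setup every test fold holds 8 samples with the same 75% PD share as the full label vector, which is the property that matters when a dataset is small and imbalanced.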
The findings indicate that Random Forest consistently outperforms Logistic Regression, achieving ROC-AUC scores of 0.96 on vocal data, 0.92 on sleep data, and 0.96 on movement data. These results suggest that early PD detection is feasible with simple, interpretable models applied to individual data modalities. However, because the datasets originate from distinct participant cohorts, any multimodal conclusions are treated as synthetic cross-dataset simulations rather than true integrated analyses. The study emphasizes that real-world multimodal validation requires synchronised, participant-level data collection.
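For readers unfamiliar with the ROC-AUC metric reported above, it can be read as the probability that a randomly chosen PD case receives a higher model score than a randomly chosen control. The sketch below computes it directly from that definition. The function name and the toy labels and scores are illustrative assumptions, not the study's code or data.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy example: 1 = PD, 0 = control; scores are hypothetical model outputs.
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.1]
print(round(roc_auc(y_true, scores), 3))  # 5 of 6 pairs ranked correctly -> 0.833
```

A score of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a value such as 0.96 means that about 96% of PD-control pairs are ordered correctly. The pairwise loop is quadratic in the sample count, which is acceptable for small datasets but would normally be replaced by a rank-based or library implementation.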
Beyond model performance, the paper underscores the ethical and practical importance of interpretability in medical AI, particularly when working with small datasets and sensitive health decisions. The author argues that while deep learning can achieve superior accuracy, it often sacrifices transparency, which limits clinical adoption. By contrast, LR and RF provide clearer decision boundaries and feature-level insights that clinicians can trust. The study concludes that interpretable models, applied to accessible and non-invasive biomarkers such as voice, movement, and sleep, can play a pivotal role in developing scalable early-screening tools for Parkinson’s disease, especially in regions with limited access to advanced diagnostic imaging such as DaTscan.