RESEARCH PAPER
Attention-based Deep Feature Fusion for Automated Dysarthria Severity Classification: A Speech-based Computational Functional Marker Relevant to Neurovascular and Neurodegenerative Conditions
Abstract
INTRODUCTION: Dysarthria is a neuromotor speech disorder that arises as a clinical manifestation of neurovascular and neurodegenerative conditions, including stroke, traumatic brain injury, Parkinson's disease, and lateral sclerosis. These conditions disrupt the neuronal and vascular pathways involved in motor speech control, reducing speech intelligibility and creating communication barriers between dysarthric speakers and the public.
METHODS: Deep Learning (DL)-based approaches are investigated for automated classification of dysarthria severity. Several configurations are compared to identify an effective one: baseline Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models trained on cepstral features, pretrained networks used as feature extractors, and hybrid approaches that combine deep features with Support Vector Machine (SVM) classifiers. An attention-based fusion strategy is then applied to the features of the two best-performing pretrained models to improve performance further.
RESULTS: The attention-based feature fusion framework achieves accuracies of 97.90% ± 0.47 on the TORGO database and 95.31% ± 0.34 on the UA-Speech corpus under utterance-level evaluation, and 62.77% ± 2.38 and 56.26% ± 3.24, respectively, under speaker-independent evaluation.
DISCUSSION: Experimental results indicate that the proposed automated framework consistently outperforms baseline approaches, providing a more robust objective metric for assessing dysarthric speech.
CONCLUSION: The experiments demonstrate that using pretrained networks as feature extractors, combined with feature engineering, improves the performance of dysarthria severity classification. The proposed framework serves as a speech-based computational functional proxy for dysarthria severity assessment, enabling objective analysis of motor speech impairment.
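
The abstract does not specify the form of the attention-based fusion, so the following is only a minimal sketch of one common variant, under stated assumptions: the deep features from the two pretrained models are assumed to be already projected to a common dimension, each feature vector is scored against a learned vector (here the hypothetical `score_weights`), the scores are normalized with a softmax, and the fused feature is the attention-weighted sum. The function names and example values are illustrative and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, score_weights):
    """Fuse equal-length feature vectors by an attention-weighted sum.

    features      : list of vectors, one per pretrained model (same dimension)
    score_weights : hypothetical learned scoring vector (same dimension)
    Returns the fused vector and the attention weights.
    """
    # One scalar relevance score per feature vector (dot product with score_weights).
    scores = [sum(w * x for w, x in zip(score_weights, f)) for f in features]
    # Normalize the scores so the attention weights are positive and sum to 1.
    alphas = softmax(scores)
    dim = len(features[0])
    # Weighted sum of the feature vectors, element by element.
    fused = [sum(a * f[i] for a, f in zip(alphas, features)) for i in range(dim)]
    return fused, alphas


# Illustrative values only: two 3-dimensional "deep feature" vectors.
feat_a = [0.2, 0.8, 0.1]
feat_b = [0.6, 0.1, 0.9]
fused, alphas = attention_fuse([feat_a, feat_b], score_weights=[0.5, -0.25, 1.0])
```

In a full pipeline, a fused vector of this kind would be passed to the downstream classifier; in practice the scoring vector would be trained jointly with that classifier rather than fixed by hand as it is here.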