Interpreting Pretrained Speech Models for Automatic Speech Assessment of Voice Disorders

Hok Shing Lau*, Mark Huntly, Nathon Morgan, Adesua Iyenoma, Biao Zeng, Tim Bashford

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Speech carries information that is clinically relevant to some diseases and therefore has potential for use in health assessment. Recent work has shown interest in applying deep learning algorithms, especially large pretrained speech models, to Automatic Speech Assessment. One question that has not been explored is how these models arrive at their outputs from their inputs. In this work, we train and compare two configurations of the Audio Spectrogram Transformer [1] for Voice Disorder Detection and apply the attention rollout method [2] to produce model relevance maps, which give the computed relevance of each spectrogram region to the model's predictions. We use these maps to analyse how the models make predictions under different conditions, and to show that the spread of attention decreases as a model is finetuned and that the model's attention concentrates on specific phoneme regions.
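
As a rough illustration of the attention rollout idea mentioned in the abstract, the sketch below follows the commonly described recipe: average each layer's attention over heads, mix in the identity matrix to account for the residual connection, renormalise the rows, and multiply the layers together from first to last. It is a minimal sketch in plain Python, not the authors' implementation. The function names, the equal 0.5/0.5 residual weighting, the toy 3-token attention matrices, and the assumption that token 0 is a classification token are all illustrative choices that do not come from this record.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add_residual(attn):
    """Mix in the identity (residual path), then renormalise rows to sum to 1."""
    n = len(attn)
    mixed = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    return [[value / sum(row) for value in row] for row in mixed]


def attention_rollout(layers):
    """Combine head-averaged attention matrices, ordered first layer first.

    Returns a matrix whose row i estimates how much each input token
    contributes to token i at the last layer.
    """
    n = len(layers[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        # Later layers are applied on the left of the running product.
        rollout = matmul(add_residual(attn), rollout)
    return rollout


# Hypothetical toy input: two layers, three tokens, each row sums to 1.
# Token 0 is assumed to be the classification token; tokens 1 and 2 stand
# in for spectrogram patches.
layers = [
    [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.1, 0.1, 0.8]],
    [[0.5, 0.4, 0.1], [0.3, 0.6, 0.1], [0.2, 0.2, 0.6]],
]
rollout = attention_rollout(layers)

# Relevance of each patch token to the classification token's output.
cls_relevance = rollout[0][1:]
print(cls_relevance)
```

In a spectrogram model, the patch relevances would then be reshaped to the patch grid and upsampled over the spectrogram to form a relevance map; how the paper handles that step is not stated in this record.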
Original language: English
Title of host publication: Artificial Intelligence in Healthcare - 1st International Conference, AIiH 2024, Proceedings
Subtitle of host publication: First International Conference, AIiH 2024, Swansea, UK, September 4–6, 2024, Proceedings, Part I
Editors: Xianghua Xie, Gibin Powathil, Iain Styles, Marco Ceccarelli
Publisher: Springer
Pages: 59-72
Number of pages: 14
ISBN (Electronic): 978-3-031-67278-1
ISBN (Print): 978-3-031-67277-4
Publication status: Published - 14 Aug 2024
Event: AIiH: International Conference on AI in Healthcare - Swansea, United Kingdom
Duration: 4 Sept 2024 to 6 Sept 2024
https://aiih.cc/#:~:text=Late%2Dbreaking%20Abstract%20Submission%20is,to%20Friday%206%20September%202024.

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 14975
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: AIiH: International Conference on AI in Healthcare
Country/Territory: United Kingdom
City: Swansea
Period: 4/09/24 to 6/09/24

Keywords

  • Speech Biomarker
  • Interpretable Machine Learning
  • Voice Disorder Detection
