Abstract

A recent result shows that inner speech can, with proper care, be decoded to the same high level of accuracy as articulated speech. This relies, however, on neural data obtained while subjects perform elicited tasks, such as covert reading and repeating, whereas a neural speech prosthetic will require the decoding of inner speech that is self-generated. Prior work has, moreover, emphasised differences between these two kinds of inner speech, raising the question of how well a decoder optimised for one will generalise to the other. In this study, we trained phoneme-level decoders on an atypically large, elicited inner speech dataset, previously acquired using 7T fMRI in a single subject. We then acquired a second, self-generated inner speech dataset in the same subject. Although the decoders were trained exclusively on neural recordings obtained during elicited inner speech, they predicted unseen phonemes accurately in both elicited and self-generated test conditions, illustrating the viability of zero-shot task transfer. This has significant practical importance for the development of a neural speech prosthetic, as labelled data is far easier to acquire at scale for elicited than for self-generated inner speech. Indeed, elicited tasks may be the only option for acquiring labelled data in critical patient populations who cannot control their vocal articulators.

Original publication

DOI

10.1101/2021.05.23.445249

Type

Journal article

Publisher

Cold Spring Harbor Laboratory

Publication Date

25/05/2021