Spoken Letter Recognition

Part of Advances in Neural Information Processing Systems 3 (NIPS 1990)



Mark Fanty, Ronald Cole


Through the use of neural network classifiers and careful feature selection, we have achieved high-accuracy speaker-independent spoken letter recognition. For isolated letters, a broad-category segmentation is performed. Location of segment boundaries allows us to measure features at specific locations in the signal, such as vowel onset, where important information resides. Letter classification is performed with a feed-forward neural network. Recognition accuracy on a test set of 30 speakers was 96%. Neural network classifiers are also used for pitch tracking and broad-category segmentation of letter strings. Our research has been extended to recognition of names spelled with pauses between the letters. When searching a database of 50,000 names, we achieved 95% first-choice name retrieval. Work has begun on a continuous letter classifier that performs frame-by-frame phonetic classification of spoken letters.
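
The abstract does not say how the name database is searched, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes the letter classifier produces a probability for each candidate letter at each spoken position, and it ranks every name of matching length by the sum of log-probabilities of its letters. The function names `score_name` and `retrieve_name`, the probability floor, and the example scores are all hypothetical.

```python
import math

# Assumed floor so that a letter the classifier never proposed still gets a
# finite (very low) score instead of log(0).
PROB_FLOOR = 1e-9


def score_name(name, letter_probs):
    """Return the log-likelihood of `name` given per-position letter scores.

    `letter_probs` is a list with one dict per spoken letter, mapping a
    candidate letter to its (assumed) classifier probability.
    """
    if len(name) != len(letter_probs):
        # This sketch only matches names with the same number of letters.
        return float("-inf")
    total = 0.0
    for letter, probs in zip(name, letter_probs):
        total += math.log(max(probs.get(letter, 0.0), PROB_FLOOR))
    return total


def retrieve_name(letter_probs, names):
    """Return the database name with the highest log-likelihood."""
    return max(names, key=lambda name: score_name(name, letter_probs))


# Hypothetical classifier output for the spelled name "b-o-b", where the
# first letter is slightly more likely to be heard as "d".
example_probs = [
    {"b": 0.4, "d": 0.5, "e": 0.1},
    {"o": 0.9, "l": 0.1},
    {"b": 0.6, "d": 0.3, "p": 0.1},
]
example_names = ["bob", "dod", "rob", "bobby"]
best = retrieve_name(example_probs, example_names)
```

In this toy example the top-scoring letter at the first position is wrong, yet the combined score over all three positions still favors the intended name. That illustrates why searching a name list can tolerate individual letter errors, under the assumptions stated above.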