Part of Advances in Neural Information Processing Systems 12 (NIPS 1999)
Howard Yang, Hynek Hermansky
In this paper, we use mutual information to characterize the distributions of phonetic and speaker/channel information in a time-frequency space. The mutual information (MI) between the phonetic label and one feature, and the joint mutual information (JMI) between the phonetic label and two or three features, are estimated. Miller's bias formulas for entropy and mutual information estimates are extended to include higher-order terms. The MI and JMI for speaker/channel recognition are also estimated. The results are complementary to those for phonetic classification. Our results show how the phonetic information is locally spread and how the speaker/channel information is globally spread in time and frequency.
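The abstract does not spell out the estimator, but a minimal sketch of the plug-in MI estimate between a phonetic label and one scalar feature, with only the first-order Miller bias correction (the paper extends this to higher-order terms), might look like the following. The function name, the histogram-based discretization, and the bin count are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def miller_corrected_mi(labels, feature, n_bins=20):
    """Plug-in mutual information (in bits) between a discrete label and a
    scalar feature, with a first-order Miller bias correction.

    labels  : 1-D array of integer class labels (e.g. phoneme indices)
    feature : 1-D array of real-valued feature samples (e.g. one
              time-frequency point of a spectral representation)
    n_bins  : number of histogram bins used to discretize the feature
    """
    labels = np.asarray(labels)
    feature = np.asarray(feature)
    n = labels.size

    # Discretize the feature and build the label-by-bin contingency table.
    edges = np.histogram_bin_edges(feature, bins=n_bins)
    f_idx = np.clip(np.digitize(feature, edges[1:-1]), 0, n_bins - 1)
    classes = np.unique(labels)
    table = np.zeros((classes.size, n_bins))
    for i, c in enumerate(classes):
        table[i] = np.bincount(f_idx[labels == c], minlength=n_bins)

    # Plug-in (maximum-likelihood) mutual information estimate.
    pxy = table / n
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    mi_ml = np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz]))

    # First-order Miller correction: E[MI_ml] ~ MI + (r-1)(c-1) / (2 N ln 2),
    # where r and c are the numbers of occupied label and feature bins.
    r = np.count_nonzero(px)
    c = np.count_nonzero(py)
    bias = (r - 1) * (c - 1) / (2.0 * n * np.log(2.0))
    return mi_ml - bias
```

In the same spirit, the JMI of two or three features with the label can be estimated by replacing the scalar feature with a joint discretization of the feature pair or triple; the bias term then grows with the product of the occupied joint bins, which is why higher-order corrections matter as the table becomes sparse.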