International Journal of Computer and Communication Technology
Abstract
Speaker-specific information present in the excitation signal is mostly viewed at the sub-segmental, segmental, and supra-segmental levels. In this work, supra-segmental level information is explored for recognizing speakers. An earlier study has shown that the combined use of pitch and epoch strength vectors provides useful supra-segmental information. However, the speaker recognition accuracy achieved with the supra-segmental level feature is poorer than that achieved with source information from the other levels. A possible reason is that the modulation information present at the supra-segmental level of the excitation signal is not manifested properly in the pitch and epoch strength vectors. We propose a method to model the supra-segmental level modulation information from residual mel frequency cepstral coefficient (R-MFCC) trajectories. The evidence from R-MFCC trajectories, combined with pitch and epoch strength vectors, is proposed to represent supra-segmental information. Experimental results show that, compared to pitch and epoch strength vectors alone, the proposed approach provides improved performance. Further, the proposed supra-segmental level information is more complementary to the information from the other levels.
Recommended Citation
Pati, Debadatta and Prasanna, S. R. Mahadeva (2013) "Speaker Recognition using Supra-segmental Level Excitation Information," International Journal of Computer and Communication Technology: Vol. 4: Iss. 1, Article 5.
DOI: 10.47893/IJCCT.2013.1165
Available at: https://www.interscience.in/ijcct/vol4/iss1/5