Fast speaker adaption via maximum penalized likelihood kernel regression

Ivor W. Tsang, James T. Kwok, Brian Mak, Kai Zhang and Jeffrey J. Pan

Abstract: Maximum likelihood linear regression (MLLR) has been a popular speaker adaptation method for many years. In this paper, we investigate a generalization of MLLR using nonlinear regression. Specifically, kernel regression with appropriate regularization is applied to determine the transformation matrix in MLLR for fast speaker adaptation. The proposed method, called maximum penalized likelihood kernel regression adaptation (MPLKR), is computationally simple: the mean vectors of the speaker-adapted acoustic model can be obtained analytically by solving a linear system. Since no nonlinear optimization is involved, the solution is guaranteed to be globally optimal. The new adaptation method was evaluated on the Resource Management task with 5 s and 10 s of adaptation speech. Results show that MPLKR outperforms the standard MLLR method.
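
The abstract's point that a regularized kernel regression can be solved in closed form, with no nonlinear optimization, can be illustrated with a minimal sketch. This is generic kernel ridge regression in NumPy, not the MPLKR objective from the paper: the RBF kernel, the names `rbf_kernel`, `fit_kernel_ridge` and `predict`, the parameters `lam` and `gamma`, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Squared Euclidean distance between every row of A and every row of B.
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_ridge(X, Y, lam=1e-3, gamma=1.0):
    # Penalized least squares in a kernel space reduces to the
    # linear system (K + lam * I) alpha = Y, solved directly.
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), Y)

def predict(X_train, alpha, X_new, gamma=1.0):
    # Prediction is a kernel-weighted combination of the dual coefficients.
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Hypothetical toy data: 20 inputs in 3 dimensions, nonlinear vector targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
Y = np.sin(X)
lam = 1e-3

alpha = fit_kernel_ridge(X, Y, lam=lam)
Y_hat = predict(X, alpha, X)
```

Because the penalized objective is quadratic in `alpha`, the single call to `np.linalg.solve` returns its global minimizer, which mirrors the global-optimality argument made in the abstract. The paper's actual method applies this idea to the mean vectors of the acoustic model under a likelihood criterion, which this sketch does not attempt to reproduce.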

Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2006), vol. 1, pp. 997-1000, Toulouse, France, May 2006.

