Hines, Andrew; Harte, Naomi (Sigmedia, Trinity College Dublin, Ireland)
Skoglund, Jan; Kokaram, Anil (Google, Inc., Mountain View, CA, USA)
A model of human speech quality perception has been developed to provide an objective measure for predicting subjective quality assessments. The Virtual Speech Quality Objective Listener (ViSQOL) model is a signal-based, full-reference metric that uses a spectro-temporal measure of similarity between a reference and a test speech signal. This paper describes the algorithm and compares its results with PESQ for two common problems in VoIP: clock drift (with its associated time warping) and jitter. The results indicate that ViSQOL is less prone than PESQ, the ITU standard, to underestimating speech quality in both scenarios.
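To make the idea of a full-reference, spectro-temporal similarity measure concrete, the following is a minimal sketch and not the ViSQOL algorithm itself. It assumes a plain short-time log-magnitude spectrogram and a single global SSIM-style score. The function names `spectrogram` and `similarity`, the frame length and hop, the constants `c1` and `c2`, and the synthetic test signals are all illustrative assumptions that do not come from the paper.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude short-time spectrogram, shaped (frequency bins, frames).

    Frame length and hop are illustrative choices, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude).T


def similarity(ref_spec, test_spec, c1=0.01, c2=0.03):
    """Global SSIM-style score: an intensity term times a structure term.

    Equals 1.0 for identical inputs and is lower otherwise.
    c1 and c2 are small stabilising constants (assumed values).
    """
    mu_r, mu_t = ref_spec.mean(), test_spec.mean()
    sd_r, sd_t = ref_spec.std(), test_spec.std()
    cov = ((ref_spec - mu_r) * (test_spec - mu_t)).mean()
    intensity = (2 * mu_r * mu_t + c1) / (mu_r**2 + mu_t**2 + c1)
    structure = (cov + c2) / (sd_r * sd_t + c2)
    return intensity * structure


# Synthetic stand-ins for speech: an amplitude-modulated tone and a noisy copy.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
reference = np.sin(2 * np.pi * 440 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t))
degraded = reference + 0.5 * rng.standard_normal(t.size)

score_same = similarity(spectrogram(reference), spectrogram(reference))
score_degraded = similarity(spectrogram(reference), spectrogram(degraded))
print(f"identical: {score_same:.3f}, degraded: {score_degraded:.3f}")
```

Because the score here is computed once over the whole spectrogram, it assumes the two signals are already time-aligned. Clock drift, time warping, and jitter violate that assumption, which is why such conditions are a demanding test for any full-reference metric.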