TY - JOUR
T1 - Further developments of a neural network speech fundamental period estimation algorithm
AU - Howard, I
PY - 1991/12/1
Y1 - 1991/12/1
N2 - This work describes a speech fundamental period estimation algorithm that estimates the time of excitation of the vocal tract using a pattern classifier, the multi-layer perceptron (MLP). The pattern classifier was trained using speech semi-automatically labelled by means of an algorithm that makes use of the output from a Laryngograph. Various issues arising in the training of the system were explored. Three basic configurations of the system were compared using different pre-processing strategies. It was found that processing the sampled speech time-waveform directly with the pattern classifier gave better results than using one of two filterbanks. The performance of the algorithm was evaluated against that of a simple peak-picking algorithm and the well-known cepstrum algorithm using quantitative frequency contour comparisons. The performance of the new algorithm on a difficult set of test data was shown to be better than that of the peak-picker and comparable to that of the cepstrum algorithm. The advantage of the scheme is that fundamental period estimates are made on a period-by-period basis, thus preserving the irregularity in the speech excitation that is lost by techniques that produce an average period estimate. In addition, its simple structure lends itself to real-time implementation (Howard & Walliker, 9; Walliker & Howard, 14).
AB - This work describes a speech fundamental period estimation algorithm that estimates the time of excitation of the vocal tract using a pattern classifier, the multi-layer perceptron (MLP). The pattern classifier was trained using speech semi-automatically labelled by means of an algorithm that makes use of the output from a Laryngograph. Various issues arising in the training of the system were explored. Three basic configurations of the system were compared using different pre-processing strategies. It was found that processing the sampled speech time-waveform directly with the pattern classifier gave better results than using one of two filterbanks. The performance of the algorithm was evaluated against that of a simple peak-picking algorithm and the well-known cepstrum algorithm using quantitative frequency contour comparisons. The performance of the new algorithm on a difficult set of test data was shown to be better than that of the peak-picker and comparable to that of the cepstrum algorithm. The advantage of the scheme is that fundamental period estimates are made on a period-by-period basis, thus preserving the irregularity in the speech excitation that is lost by techniques that produce an average period estimate. In addition, its simple structure lends itself to real-time implementation (Howard & Walliker, 9; Walliker & Howard, 14).
UR - https://pearl.plymouth.ac.uk/context/secam-research/article/1713/viewcontent/How91.pdf
M3 - Conference proceedings published in a journal
SN - 0537-9989
VL - 0
SP - 340
EP - 344
JO - IEE Conference Publication
JF - IEE Conference Publication
IS - 349
ER -