Abstract
The research presented in this thesis develops a low-cost digital vocoder based on the principle of linear predictive coding (LPC). This system eliminates many of the complexities which have evolved in LPC vocoders over recent years while preserving good quality speech at low bit rates. In conventional LPC systems voiced speech is analysed in fixed frames of approximately twenty milliseconds duration. These frames, which can cover several pitch periods, must be windowed to ensure stable LPC coefficients. The system developed takes as its source data for voiced speech a single pitch period, which is assumed to be periodic for the purposes of analysis. This eliminates not only the spectral distortion caused by windowing but also any spectral blurring due to pitch variations over the frame.
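The idea of treating a single pitch period as periodic can be illustrated with a circular autocorrelation, in which lagged samples wrap around to the start of the period so that no window is needed. The following is a minimal sketch only, not the thesis implementation: the function name `periodic_autocorrelation` and its parameters are hypothetical, and it uses floating-point Python rather than fixed-point TMS32010 assembler.

```python
def periodic_autocorrelation(period, max_lag):
    """Circular autocorrelation of one pitch period (illustrative sketch).

    `period` holds the samples of a single pitch period. Because the period
    is assumed to repeat exactly, a lagged index wraps around modulo the
    period length instead of running off the end of a windowed frame.
    """
    n = len(period)
    return [
        sum(period[i] * period[(i + lag) % n] for i in range(n))
        for lag in range(max_lag + 1)
    ]


# Toy example: a four-sample "pitch period".
r = periodic_autocorrelation([1, 0, -1, 0], 4)
```

Under the periodicity assumption the autocorrelation at a lag equal to the period length matches the value at lag zero, which is the property that distinguishes this from a windowed, fixed-frame estimate.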
Both voiced and unvoiced speech are analysed and synthesised in a lattice structure on the TMS32010. Spectral complexity usually necessitates a 10th order filter for voiced speech and a 6th order filter for unvoiced speech. It has been found that under certain circumstances these requirements can be relaxed and the filter order reduced. This results in a variable length filter system which can be implemented with confidence in fixed point arithmetic by monitoring the residual error at the analysis stage.
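One way to picture a variable length filter driven by the residual error is a Levinson-Durbin recursion that yields lattice reflection coefficients and stops adding stages once the prediction error is small enough. The sketch below is an assumption-laden illustration, not the thesis algorithm: the name `levinson_variable_order`, the stopping rule, and the `error_ratio_floor` threshold are all hypothetical, the sign convention for reflection coefficients varies between texts, and the arithmetic is floating point rather than fixed point.

```python
def levinson_variable_order(r, max_order, error_ratio_floor=0.01):
    """Levinson-Durbin recursion with an early stop on residual error.

    `r` is an autocorrelation sequence with at least max_order + 1 values.
    Returns (reflection_coefficients, residual_error). The stopping rule
    (residual energy relative to r[0]) is an assumed, illustrative criterion.
    """
    error = float(r[0])
    if error <= 0.0:
        return [], 0.0  # silent input: nothing to predict

    a = [1.0]           # prediction polynomial, a[0] is always 1
    reflection = []
    for m in range(1, max_order + 1):
        # Correlation between the current prediction error and the lag-m sample.
        acc = sum(a[j] * r[m - j] for j in range(m))
        k = -acc / error
        # Order update of the prediction polynomial.
        a = [1.0] + [a[j] + k * a[m - j] for j in range(1, m)] + [k]
        reflection.append(k)
        error *= 1.0 - k * k
        # Stop lengthening the filter once the residual is small enough.
        if error <= error_ratio_floor * r[0]:
            break
    return reflection, error


# Autocorrelation of a first-order process: later stages add nothing.
ks, err = levinson_variable_order([1.0, 0.5, 0.25, 0.125], 3)
```

In this example only the first reflection coefficient is non-zero, so a higher-order lattice would not reduce the residual further; monitoring the residual in this way is one plausible basis for shortening the filter.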
A prerequisite for the periodic pitch autocorrelation technique used in voiced speech analysis is a fast, reliable and accurate pitch detection algorithm. Several pitch detectors were investigated and developed to a stage where their performance could be assessed. The most favourable of these was based on feature extraction using the glottal impulse as its primary source of detection. This basic technique was developed using additional features found in voiced speech to give a quick and reliable pitch detector capable of locating the start and end of each pitch period in real time.
The software for the vocoder has been written in assembler to operate in real time on the TMS32010 and tested in detail on an IBM computer using the high-level language Fortran. Objective and subjective results are presented to indicate the quality, naturalness and intelligibility of the synthetic speech. Further developments necessary to implement the system are considered, together with various refinements for its enhancement.
Date of Award | 1990 |
---|---|
Original language | English |
Awarding Institution |
|