An Automatic Prolongation Detection Approach in Continuous Speech With Robustness Against Speaking Rate Variations

Iman Esmaili, Nader Jafarnia Dabanloo, Mansour Vali

Abstract


In recent years, many methods have been introduced to support the diagnosis of stuttering by automatically detecting prolongations in the speech of people who stutter. However, less attention has been paid to treatment processes in which clients learn to speak more slowly. The aim of this study was to develop a method that helps speech-language pathologists (SLPs) during both diagnosis and treatment sessions. To this end, speech signals were first parameterized as perceptual linear predictive (PLP) features. To detect prolonged segments, the similarity between successive frames of the speech signal was calculated using correlation similarity measures. A segment was labeled as a prolongation when the duration of highly similar successive frames exceeded a threshold determined by the speaking rate. The proposed method was evaluated on the UCLASS database and a self-recorded Persian speech database. The results were also compared with three high-performance studies in automatic prolongation detection. The best prolongation detection accuracies were 99% and 97.1% for the UCLASS and Persian databases, respectively. The proposed method also showed promising robustness against artificial variation of the speaking rate from 70% to 130% of the normal speaking rate.
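The detection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of Pearson correlation between adjacent frames, the similarity threshold, and the inverse scaling of the duration threshold with speaking rate are all assumptions made for illustration, and the input is any per-frame feature matrix (PLP features would be computed separately).

```python
import numpy as np


def successive_correlation(features):
    """Pearson correlation between each frame and the following frame.

    features: array of shape (n_frames, n_coeffs), e.g. PLP coefficients.
    Returns an array of length n_frames - 1.
    """
    x = features - features.mean(axis=1, keepdims=True)
    num = (x[:-1] * x[1:]).sum(axis=1)
    den = np.linalg.norm(x[:-1], axis=1) * np.linalg.norm(x[1:], axis=1)
    # Guard against zero-variance frames (treated as uncorrelated).
    return num / np.maximum(den, 1e-12)


def rate_scaled_duration(base_duration_s, rate_factor):
    """Hypothetical rule: slower speech (rate_factor < 1) raises the threshold."""
    return base_duration_s / rate_factor


def detect_prolongations(features, frame_shift_s, sim_threshold, min_duration_s):
    """Return (start_s, end_s) for runs of highly similar successive frames
    whose duration is at least min_duration_s."""
    high = successive_correlation(features) >= sim_threshold
    runs, start = [], None
    for i, flag in enumerate(high):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(high)))

    segments = []
    for s, e in runs:
        # Similarities s..e-1 link frames s..e, i.e. e - s + 1 frames.
        duration = (e - s + 1) * frame_shift_s
        if duration >= min_duration_s:
            segments.append((s * frame_shift_s, (e + 1) * frame_shift_s))
    return segments
```

In this sketch the duration threshold is passed in explicitly, so a caller could derive it from an estimated speaking rate, for example with the hypothetical `rate_scaled_duration` helper, before calling `detect_prolongations`.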

Keywords


Attention; language; learning; pathologists; speech; speech-language pathology; stuttering


References


Starkweather C. Fluency and Stuttering. Englewood Cliffs, NJ: Prentice-Hall; 1987.

Adams MR. A clinical strategy for differentiating the normally nonfluent child and the incipient stutterer. J Fluen Disord 1977;2:141-8.

Conture EG. Stuttering. 2nd ed. Englewood Cliffs, NJ: Prentice Hall; 1990.

Yaruss JS. Clinical measurement of stuttering behaviors. Contemp Issues Commun Sci Disord 1997;24:33-44.

Curlee RF. Observer agreement on disfluency and stuttering. J Speech Lang Hear Res 1981;24:595-600.

Wisniewski M, Kuniszyk-Jóźkowiak W, Smolka E, Suszyński W. Automatic detection of prolonged fricative phonemes with the hidden Markov models approach. J Med Inform Technol 2007;11:293-8.

Suszyński W, Kuniszyk-Jóźkowiak W, Smolka E, Dzieńkowski M. Prolongation detection with application of fuzzy logic. Ann UMCS Inform 2003;1:133-40.

Hariharan M, Fook CY, Sindhu R, Adom AH, Yaacob S. Objective evaluation of speech dysfluencies using wavelet packet transform with sample entropy. Digit Signal Process 2013;23:952-9.

Swietlicka I, Kuniszyk-Jóźkowiak W, Smolka E. Hierarchical ANN system for stuttering identification. Comput Speech Lang 2013;27:228-42.

Mahesha P, Vinod DS. Gaussian mixture model based classification of stuttering dysfluencies. J Intell Syst 2015;25:387-99.

Howell P, Sackin S. Automatic recognition of repetitions and prolongations in stuttered speech. Proceedings of the First World Congress on Fluency Disorders, 1995. p. 372-4.

Ai OC, Hariharan M, Yaacob SB, Chee LS. Classification of speech dysfluencies with MFCC and LPCC features. Expert Syst Appl;39:2157-65.

Roth FP, Worthington CK. Treatment Resource Manual for Speech Language Pathology. 4th ed. Clifton Park, NY: Delmar; 2011.

Fook CY, Hariharan M, Chee LS, Yaacob SB, Adom AH. Comparison of speech parameterization techniques for classification of speech dysfluencies. Turk J Electr Eng Comput Sci 2013;21:1983-94.

Hermansky H. Perceptual linear predictive (PLP) analysis of speech. J Acoust Soc Am 1990;87:1738-52.

Howell P, Davis S, Bartrip J. The UCLASS archive of stuttered speech. J Speech Lang Hear Res 2009;52:556-69.

de Andrade CR, Cervone LM, Sassi FC. Relationship between the stuttering severity index and speech rate. Sao Paulo Med J 2003;121:81-4.

Pfau T, Ruske G. Estimating the speaking rate by vowel detection. Acoust Speech Signal Process 1998;2:945-8.

de Jong NH, Wempe T. Automatic measurement of speech rate in spoken Dutch. ACLC Work Pap 2007;2:49-58.

Yuan J, Liberman M. Robust speaking rate estimation using broad phonetic class recognition. 2010 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2010. p. 4222-5.

Esmaili I, Jafarnia Dabanloo N, Vali M. Automatic classification of speech dysfluencies in continuous speech based on similarity measures and morphological image processing tools. Biomed Signal Process Control 2016;23:104-14.

Duda RO, Hart PE, Stork DG. Pattern Classification. 2nd ed. New York, NY: John Wiley and Sons; 2001.



ISSN : 2228-7477