Venue: LISN, Site Belvédère

Seminars, STL

Methodology for spontaneous speech corpora compilation

Speaker: Tommaso Raso (Federal University of Minas Gerais, Brazil)

I will present the methodology used to compile the spontaneous speech corpora of the C-ORAL family. I will discuss the importance of diaphasic variation for spontaneous speech data and describe the architecture of the corpora. I will then go through the phases of corpus compilation and the methodological issues each one raises: recording, transcription, prosodic segmentation, revision, alignment, and statistical validation. Special attention will be given to prosodic segmentation: why it is important and how it can be performed.
