Definition: A chunk is the smallest possible sequence of linguistic items and is a non-recursive constituent. It includes a head and the other items that depend on it. Dividing an utterance into chunks gives a first segmentation that reveals potential macro-syntactic boundaries. This type of annotation is particularly useful for spoken language, which can be hard to analyse syntactically, especially because of unfinished sentences.

Example from Clapi:


Example from ESLO2:



Categories:

tag       chunk type                                        corpus example
AP        adjectival chunk                                  trop [AP 0] joli [AP 1]
AdP       adverbial chunk                                   peut-être [AdP 0]
NP        nominal chunk                                     tes [NP 0] chaussures [NP 1]
PP        prepositional chunk                               de [PP 0] loin [PP 1]
VP        verbal chunk                                      on [VP 0] nous [VP 1] entend [VP 2]
ARTIC     articulator (connective)                          et [ARTIC 0]
FNO       nucleus form (forme noyau)                        salut [FNO 0]
UNKNOW    unidentified chunk (false start, unknown word)    le [UNKNOW]
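
To make the tagging scheme in the table more concrete, here is a minimal Python sketch that groups annotated tokens into chunks. It rests on two assumptions that this page does not confirm: that the inline `word [TAG n]` layout used in the examples is also how annotations can be read, and that `n` is the position of the token inside its chunk (so `0` opens a new chunk). The function name `parse_chunks` is hypothetical and is not part of any corpus tool.

```python
import re
from typing import List, Tuple

# Tags copied from the category table; everything else here is illustrative.
TAGS = {"AP", "AdP", "NP", "PP", "VP", "ARTIC", "FNO", "UNKNOW"}

# One token followed by a bracketed tag and an optional position index.
TOKEN_RE = re.compile(r"(\S+?)\s*\[(\w+)(?:\s+(\d+))?\]")


def parse_chunks(annotated: str) -> List[Tuple[str, List[str]]]:
    """Group annotated tokens into (tag, words) chunks."""
    chunks: List[Tuple[str, List[str]]] = []
    for word, tag, index in TOKEN_RE.findall(annotated):
        if tag not in TAGS:
            raise ValueError(f"unknown tag: {tag}")
        # Assumption: index 0 (or a missing index) opens a new chunk,
        # and a higher index continues the current chunk of the same tag.
        starts_new = index in ("", "0") or not chunks or chunks[-1][0] != tag
        if starts_new:
            chunks.append((tag, [word]))
        else:
            chunks[-1][1].append(word)
    return chunks


print(parse_chunks("tes [NP 0] chaussures [NP 1] et [ARTIC 0]"))
# prints [('NP', ['tes', 'chaussures']), ('ARTIC', ['et'])]
```

Because chunks are non-recursive, a flat list of (tag, words) pairs is enough to represent them; no tree structure is needed.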

The annotation was carried out by Marie Skrovec and Iris Eshkol.



Link to bibliography

Link to publication