Our project takes into account several perspectives on oral language to carry out the segmentation task:
- a first segmentation level for French based on maximal units according to the Orfeo ANR guidelines
- a simplified segmentation and annotation of maximal syntactic units in German
- a segmentation into chunks for French, using the methodological approach developed in the LLL team
- an online annotation experiment in German in which pauses are annotated with respect to their likelihood of being a segment boundary
- a segmentation into prosodic prominences for French, following the Rhapsody ANR guidelines, as discussed with Mathieu Avanzi during our workshop on prosody
- Syntax:
  - Microsyntax: a segmentation into microsyntactic units for French.
  - Macrosyntax: a segmentation into macrosyntactic units for French, based on a methodological approach developed in SegCor that takes into account both the Rhapsody ANR project and the Orfeo ANR project, but with different proposals for the treatment of nuclei.
- Illocutionary units: a pragmatic segmentation scheme based on the French macrosyntactic units, adapted to German
- a segmentation and annotation of syntactic units on several (hierarchically dependent) layers:
  - Topological fields
  - Position of the finite verb
  - Maximal syntactic unit
- a segmentation for German based on phenomena typical of spoken language
- an exploratory segmentation into TRPs (Transition-Relevance Places) and actions, proposing an interactional segmentation of multi-unit turns, developed jointly for German and French
All resources (corpus, annotations, guides, tools) may be reused as they are or modified, for non-commercial purposes only and with citation of the source (SegCor, http://segcor.cnrs.fr for annotations and guides; Eslo for the Eslo corpus, http://eslo.huma-num.fr/, or Clapi for the Clapi corpus, http://clapi.icar.cnrs.fr; and http://segcor.cnrs.fr/deliverable/tools/ for the tools). They may be redistributed under the same conditions, in accordance with the Creative Commons 4.0 International license (CC BY-NC-SA 4.0, https://creativecommons.org/licenses/by-nc-sa/4.0/).