Publications by Cirad staff

Automatic biomedical term polysemy detection

Lossio-Ventura J.A., Jonquet C., Roche M., Teisseire M. 2016. In: Calzolari Nicoletta (ed.), Choukri Khalid (ed.), Declerck Thierry (ed.), Goggi Sara (ed.), Grobelnik Marko (ed.), Maegaard Bente (ed.), Mariani Joseph (ed.), Mazo Hélène (ed.), Moreno Asuncion (ed.), Odijk Jan (ed.), Piperidis Stelios (ed.). LREC 2016 Proceedings. Portoroz: ELRA, p. 1684-1688. International Conference on Language Resources and Evaluation (LREC 2016). 10, 2016-05-23/2016-05-28, Portoroz (Slovenia).

Polysemy is the capacity of a word to have multiple meanings. Polysemy detection is a first step for Word Sense Induction (WSI), which identifies the different meanings of a term. It is also important for information extraction (IE) systems and for building or enriching terminologies and ontologies. In this paper, we present a novel approach to detect whether a biomedical term is polysemic, with the long-term goal of enriching biomedical ontologies. The approach is based on the extraction of new features, obtained in two ways: (i) directly from the text dataset, and (ii) from an induced graph. Our method obtains an accuracy and F-measure of 0.978.
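The two feature sources named in the abstract can be illustrated with a minimal sketch. The abstract does not list the actual features, so everything below is an assumption made for illustration: the function names, the feature names (`n_contexts`, `n_context_words`, `n_nodes`, `n_edges`, `n_components`), the whitespace tokenization, and the toy sentences are all hypothetical. The graph here is a simple co-occurrence graph over the words that appear alongside the term; the intuition is that context words falling into several disconnected groups may hint at several senses.

```python
from collections import defaultdict
from itertools import combinations


def _contexts(term, sentences):
    """Tokenize naively on whitespace and keep sentences containing the term."""
    tokenized = [s.lower().split() for s in sentences]
    return [tokens for tokens in tokenized if term in tokens]


def text_features(term, sentences):
    """Hypothetical features read directly from the text."""
    contexts = _contexts(term, sentences)
    vocab = {w for ctx in contexts for w in ctx if w != term}
    return {"n_contexts": len(contexts), "n_context_words": len(vocab)}


def graph_features(term, sentences):
    """Hypothetical features read from an induced co-occurrence graph.

    Nodes are the words co-occurring with the term; two words are linked
    when they share a sentence with the term. The term itself is left out.
    """
    adjacency = defaultdict(set)
    for ctx in _contexts(term, sentences):
        words = sorted(set(ctx) - {term})
        for w in words:
            adjacency[w]  # make sure isolated words still become nodes
        for a, b in combinations(words, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Count connected components with an iterative depth-first search.
    seen, n_components = set(), 0
    for node in list(adjacency):
        if node in seen:
            continue
        n_components += 1
        stack = [node]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(adjacency[current] - seen)

    n_edges = sum(len(neighbors) for neighbors in adjacency.values()) // 2
    return {"n_nodes": len(adjacency), "n_edges": n_edges,
            "n_components": n_components}


# Toy, made-up sentences (not data from the paper).
SENTENCES = [
    "patient caught a cold virus",
    "cold virus infection spreads",
    "cold temperature exposure",
    "aspirin reduces fever",
]

if __name__ == "__main__":
    print(text_features("cold", SENTENCES))
    print(graph_features("cold", SENTENCES))
```

In a full pipeline, feature dictionaries like these would be turned into vectors and passed to a supervised classifier that labels each term as polysemic or not; the choice of classifier is not stated in the abstract and is left open here.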

Related documents

Conference paper

Cirad staff who authored this publication: