Publications by Cirad staff


Extracting absolute spatial entities from SMS: Comparing a supervised and an unsupervised approach

Lopez C., Zenasni S., Kergosien E., Partalas I., Roche M., Teisseire M., Panckhurst R. 2018. In: Cougnon Louise-Amélie (ed.), De Cock Barbara (ed.), Fairon Cédrick (ed.). Language and the new (instant) media. Louvain-la-Neuve: Presses universitaires de Louvain, p. 15-22. (Cahiers du CENTAL, 9).

DOI: 10.18167/DVN1/0ZGJRC

More than one hundred thousand SMS messages are sent worldwide every second, and each SMS message is likely to contain lexical creativity. Recently, SMS content has been recognised as being of notable interest in many domains, such as e-commerce, psychiatry and, more generally, health informatics. However, the automatic analysis of such data is difficult, particularly for information extraction. This study focuses on “spatial entity recognition”, which consists of recognising countries, cities, places, bars, restaurants, cinemas, beaches, and so forth. For instance, Montpel, mtpl, mtp, and motpeliè all stand for the city of Montpellier. We compare two ways, one supervised and one unsupervised, of tackling new forms of spatial entity recognition in SMS.

Associated documents

Book chapter

Cirad staff who authored this publication: