Publications by Cirad staff


A lightweight and multilingual framework for crisis information extraction from Twitter data

Interdonato R., Guillaume J.L., Doucet A. 2019. Social Network Analysis and Mining, 9: 20 p.

DOI: 10.1007/s13278-019-0608-4

Obtaining relevant, timely information during crisis events is a challenging task that can be fundamental to handling the consequences of both unexpected events (e.g., terrorist attacks) and partially predictable ones (e.g., natural disasters). Even though microblogging-based online social networks (e.g., Twitter) have become an attractive data source in these emergency situations, overcoming the information overload deriving from mass events is not trivial. The aim of this work was to enable unsupervised extraction of relevant information from Twitter data during a crisis event, offering a lightweight alternative to learning-based approaches. The proposed crisis management framework integrates natural language processing and clustering techniques to produce a ranking of tweets relevant to a crisis situation based on their informativeness. Experiments carried out on six Twitter collections in two languages (English and French) demonstrated the significance and flexibility of the approach.
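To make the idea of unsupervised, informativeness-based ranking concrete, here is a minimal sketch in Python. It is not the method of the paper, whose details are not given in this record. Every name in it (`tokenize`, `jaccard`, `rank_tweets`, `sim_threshold`) is hypothetical, and both the scoring rule (sum of inverse document frequencies of a tweet's distinct terms) and the clustering rule (greedy grouping of near-duplicates by Jaccard overlap) are assumptions chosen for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and keep runs of Unicode letters (accents survive)."""
    return set(re.findall(r"[^\W\d_]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_tweets(tweets, sim_threshold=0.5):
    """Return (score, tweet) pairs, best first, one per near-duplicate cluster.

    Assumed scoring: sum of IDF weights of the distinct terms in a tweet.
    Assumed clustering: a tweet is dropped when it overlaps an already
    selected, higher-scoring tweet by at least `sim_threshold` (Jaccard).
    """
    docs = [tokenize(t) for t in tweets]
    n = len(docs)
    # Document frequency: in how many tweets each term appears.
    df = Counter(term for doc in docs for term in doc)
    scores = [sum(math.log(n / df[term]) for term in doc) for doc in docs]

    # Visit tweets from highest to lowest score (ties keep input order).
    order = sorted(range(n), key=lambda i: -scores[i])
    selected = []
    for i in order:
        if all(jaccard(docs[i], docs[j]) < sim_threshold for j in selected):
            selected.append(i)
    return [(scores[i], tweets[i]) for i in selected]


if __name__ == "__main__":
    sample = [
        "Explosion reported near central station, police on site",
        "explosion reported near central station police on site!!",
        "good morning everyone",
        "Roads closed after flood in Nice, shelters open at city hall",
    ]
    for score, tweet in rank_tweets(sample):
        print(f"{score:6.2f}  {tweet}")
```

The sketch needs no training data and no language-specific resources, which is the sense in which such an approach can be called lightweight and multilingual. The IDF sum is a crude proxy that favors longer tweets; a real system would likely normalize it and filter stop words.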

Keywords: economic crisis; disaster; social networks; text mining; data mining; data analysis; data processing; information processing; twitter

Related documents

Article (b-peer-reviewed journal)

Cirad staff who authored this publication: