Novelty detection in event surveillance documents
Menya E., Interdonato R., Owuor D., Roche M. 2026. IEEE Access, 14: p. 29566-29589.
Event-Based Surveillance (EBS) monitors online sources such as broadcast, print, and web news and generates early warning and response (EWAR) signals for use in disaster mitigation. These online sources are dynamic, allowing for potentially real-time EBS updates. However, news articles fragment information across varied sources, and redundant information is known to overburden EBS. In this study, we propose a Large Language Model (LLM) based approach that filters out redundancies while learning novel information from event-centric online news corpora. We study this novelty task for events covering the animal health, food security and climate change surveillance domains. Our approach focuses on features integrating spatio-temporal information (such as the location and date of an event) and thematic information (such as the name of a disease, food insecurity triggers, or climate change magnitude). We characterize novelty as the presence of new and additional information (e.g., a newly mentioned disease name or additional location information), as distinguished from duplicate (e.g., an already seen disease name) and missing (expected but absent) information. In this regard, our approach proposes a fine-grained classification of novelty in event surveillance and adopts language modeling with a multi-class classification objective to learn to classify event information. Our LLM adoption strategy proposes question-based prompts whose extracted answers map to predefined feature types (e.g., location, date, name of disease) in order to enrich our classifier. In our empirical studies, we present a comparative analysis of language models and large language models for state-of-the-art performance in the event novelty classification task. Our findings demonstrate the ability of cross-domain novelty classification, with our model EpidGPT (few-shot) achieving F1 scores of 82.3%, 85.49% and 88.97% in the animal health, food security and climate change domains, respectively, while fi
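The three-way distinction drawn in the abstract (novel, duplicate, missing) can be illustrated with a small rule-based sketch. This is not the authors' LLM classifier: it only shows the labelling scheme applied to feature values that are assumed to have been extracted already. The names `Novelty`, `label_features`, `seen` and `expected` are hypothetical and do not come from the paper.

```python
from enum import Enum


class Novelty(Enum):
    """Hypothetical label set mirroring the abstract's three categories."""
    NOVEL = "novel"          # new or additional information
    DUPLICATE = "duplicate"  # value already seen for this feature
    MISSING = "missing"      # feature expected but absent


def label_features(extracted, seen, expected):
    """Label each expected feature of one article.

    extracted: dict mapping feature type -> extracted value (or None)
    seen:      dict mapping feature type -> set of lower-cased values seen so far
    expected:  list of feature types an event report should contain
    """
    labels = {}
    for feature in expected:
        value = extracted.get(feature)
        if value is None or not value.strip():
            labels[feature] = Novelty.MISSING
        elif value.strip().lower() in seen.get(feature, set()):
            labels[feature] = Novelty.DUPLICATE
        else:
            labels[feature] = Novelty.NOVEL
    return labels


# Illustrative values only (not data from the paper).
seen = {"disease": {"avian influenza"}, "location": {"kenya"}}
article = {"disease": "Avian influenza", "location": "Uganda"}
labels = label_features(article, seen, ["disease", "location", "date"])
```

In this toy case the disease name is a duplicate, the location is novel, and the date is missing. Exact string matching is a simplifying assumption made for the sketch; the paper instead learns this classification with a language model.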
Keywords: climate change; animal health; food security; text mining; epidemiological surveillance; novel food; animal diseases
Associated documents
Article (a-journal with impact factor)
Cirad staff, authors of this publication:
- Interdonato Roberto — Es / UMR TETIS
- Menya Edmond — Es / UMR TETIS
- Roche Mathieu — Es / UMR TETIS
