Publications by Cirad staff


Information retrieval for animal disease surveillance: a pattern-based approach

Valentin S., Lancelot R., Roche M. 2020. In: Holderness Eben (ed.), Yepes Antonio Jimeno (ed.), Lavelli Alberto (ed.), Minard Anne-Lyse (ed.), Pustejovsky James (ed.), Rinaldi Fabio (ed.). Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis. Stroudsburg: Association for Computational Linguistics, p. 70-78. (LOUHI, 2020). International Workshop on Health Text Mining and Information Analysis, 2020-11-16/2020-11-20.

DOI: 10.18167/DVN1/YGAKNB

DOI: 10.18653/v1/2020.louhi-1.8

Animal disease-related news articles are rich in information useful for risk assessment. In this paper, we explore a method to automatically retrieve sentence-level epidemiological information. Our method is an incremental approach that creates and expands patterns at both the lexical and syntactic levels. Expert knowledge is used as input at different steps of the approach. Distributed vector representations (word embeddings) are used to expand the patterns at the lexical level, thus reducing the need for manual curation. We show that expert validation is crucial to improving the precision of automatically generated patterns.

