Going deeper in the automated identification of Herbarium specimens
Carranza-Rojas J.M., Goeau H., Bonnet P., Mata-Montero E., Joly A. 2017. BMC Evolutionary Biology, 17: 14 p.
Background: Hundreds of herbarium collections have accumulated a valuable heritage and knowledge of plants over several centuries. Recent initiatives have launched ambitious plans to digitize this information and make it available to botanists and the general public through web portals. However, thousands of sheets are still unidentified at the species level, while numerous sheets should be reviewed and updated to reflect more recent taxonomic knowledge. These annotations and revisions require more work than botanists can carry out in a reasonable time. Computer vision and machine learning approaches applied to herbarium sheets are promising but remain less well studied than automated species identification from leaf scans or pictures of plants in the field.
Results: In this work, we study and evaluate how accurately herbarium images can be exploited for species identification with deep learning. We also study whether combining herbarium sheets with photos of plants in the field helps in terms of accuracy. Finally, we explore whether herbarium images from one region with a specific flora can be used for transfer learning to another region with other species, for example a region under-represented in terms of collected data.
Conclusions: This is, to our knowledge, the first study that uses deep learning to analyze a large dataset with thousands of species from herbaria. Results show the potential of deep learning for herbarium species identification, particularly when training and testing across different datasets from different herbaria. This could lead to a semi-automated or even fully automated system to help taxonomists and experts with their annotation, classification, and revision work.
Keywords: herbarium; machine learning; identification; imaging; botanical collection; biodiversity; data processing
Associated documents
Article (a-journal with impact factor)
Cirad staff, authors of this publication:
- Bonnet Pierre — Bios / UMR AMAP
- Goeau Hervé — Bios / UMR AMAP