Overview of LifeCLEF plant identification task 2020

Goeau H., Bonnet P., Joly A. 2020. In: Cappellato Linda (ed.), Eickhoff Carsten (ed.), Ferro Nicola (ed.), Névéol Aurélie (ed.). Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum. Aachen: CEUR-WS, 15 p. (CEUR Workshop Proceedings, 2696). Conference and Labs of the Evaluation Forum (CLEF 2020), 2020-09-22/2020-09-25, Thessaloniki (Greece).

Automated identification of plants has improved considerably thanks to recent progress in deep learning and the availability of training data containing a growing number of field photos. However, this profusion of data concerns only a few tens of thousands of species, mostly located in North America and Western Europe, and far fewer in the regions richest in biodiversity, such as tropical countries. On the other hand, for several centuries, botanists have collected, catalogued and systematically stored plant specimens in herbaria, particularly in tropical regions, and recent efforts by the biodiversity informatics community have made it possible to put millions of digitized sheets online. The LifeCLEF 2020 Plant Identification challenge (or "PlantCLEF 2020") was designed to evaluate to what extent automated identification of the flora of data-deficient regions can be improved by the use of herbarium collections. It is based on a dataset of about 1,000 species focused mainly on the Guiana Shield of South America, an area known to have one of the greatest diversities of plants in the world. The challenge was evaluated as a cross-domain classification task in which the training set consists of several hundred thousand herbarium sheets and a few thousand field photos, to enable learning a mapping between the two domains. The test set was composed exclusively of field photos. This paper presents the resources and assessments of the conducted evaluation, summarizes the approaches and systems employed by the participating research groups, and provides an analysis of the main outcomes.

