Publications by Cirad staff

Cirad

Reinforcement learning for crop management support for smallholder farmers in southern countries: towards risk management

Gautron R. 2022. Montpellier: Institut Agro Montpellier, 181 p. Doctoral thesis: Machine learning applied to agronomy.

Crop management is the logical and ordered combination of agricultural operations applied to a field in order to obtain a particular crop production. Decisions about these operations are not straightforward, as they are made in the face of uncertain events such as weather. After decades of development of computerized decision-making tools for crop management support, these specialized decision support systems (DSS) still face poor adoption. DSS users report that information cannot be turned directly into actions, that farmers' natural decision-making processes are not adequately taken into account, that the sequential nature of decisions is poorly modeled, or that risk management is missing from the decision process. Reinforcement learning (RL), a branch of machine learning, addresses the control of uncertain and unknown dynamical systems. RL inherently deals with sequences of decisions with uncertain consequences, and shares some similarities with how farmers are described as approaching crop management, e.g. learning by trial and error. Yet very few applications of RL to crop management support are found. RL generally requires millions of interactions to solve decision problems that are simple compared to crop management. In this thesis, we study how RL can improve decision support for crop management, focusing on smallholder farmers in southern regions. In this context, crop management support is even more challenging because of data scarcity and high yield variability in rainfed cropping systems. We provide a generic method to turn crop models into standardized and easy-to-manipulate RL environments, which allows RL agents to be trained extensively at a negligible computational cost. In simulated conditions, we successfully learn sustainable crop practices with an RL algorithm. Yet, we show that for most applications, considering both a risk-neutral and risk-aware decision criterion, the statistical significance of the identification of best practices from model
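As an illustration of what turning a crop model into a standardized RL environment can look like, here is a minimal sketch in Python. It is not the method or code of the thesis: `ToyCropModel`, `CropEnv`, the fertilizer doses, the cost coefficient and the growth rule are all hypothetical, chosen only to show a reset/step interface wrapped around a simulator with uncertain weather.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyCropModel:
    """Hypothetical stand-in for a real crop simulator (not from the thesis)."""
    seed: int = 0
    season_length: int = 10  # number of decision steps per season

    def __post_init__(self):
        self.rng = random.Random(self.seed)
        self.start_season()

    def start_season(self):
        self.day = 0
        self.soil_nitrogen = 20.0  # kg/ha, arbitrary starting value
        self.biomass = 0.0

    def advance(self, fertilizer_kg_ha):
        """Advance one step; growth depends on nitrogen and random rainfall."""
        self.soil_nitrogen += fertilizer_kg_ha
        rainfall = self.rng.uniform(0.0, 1.0)  # uncertain weather
        uptake = min(self.soil_nitrogen, 10.0 * rainfall)
        self.soil_nitrogen -= uptake
        self.biomass += uptake
        self.day += 1


class CropEnv:
    """Wrapper exposing reset() -> obs and step(action) -> (obs, reward, done, info)."""

    ACTIONS = (0.0, 10.0, 20.0)  # fertilizer doses in kg/ha (illustrative)
    FERTILIZER_COST = 0.5        # reward penalty per kg applied (illustrative)

    def __init__(self, seed=0):
        self.model = ToyCropModel(seed=seed)

    def _observe(self):
        return (self.model.day, self.model.soil_nitrogen, self.model.biomass)

    def reset(self):
        self.model.start_season()
        return self._observe()

    def step(self, action):
        dose = self.ACTIONS[action]
        biomass_before = self.model.biomass
        self.model.advance(dose)
        # Reward: biomass gained during this step minus the cost of the dose.
        reward = (self.model.biomass - biomass_before) - self.FERTILIZER_COST * dose
        done = self.model.day >= self.model.season_length
        return self._observe(), reward, done, {}


def run_episode(env, policy):
    """Run one season with a policy mapping an observation to an action index."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total
```

For example, `run_episode(CropEnv(seed=1), lambda obs: 0)` simulates one season without any fertilizer. Keeping the simulator behind a fixed reset/step interface is what lets the same agent code be reused across different crop models.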

Associated documents

Thesis