Publications by Cirad staff


Component-based regularization of a multivariate GLM with a thematic partitioning of the explanatory variables

Bry X., Trottier C., Mortier F., Cornu G. 2020. Statistical Modelling, 20 (1): p. 96-119.

DOI: 10.1177/1471082X18810114

We address component-based regularization of a multivariate generalized linear model (GLM). A vector of random responses Y is assumed to depend, through a GLM, on a set X of explanatory variables, as well as on a set A of additional covariates. X is partitioned into R conceptually homogeneous variable groups X1, …, XR, viewed as explanatory themes. Variables in each Xr are assumed many and redundant. Thus, generalized linear regression demands dimension reduction and regularization with respect to each Xr. By contrast, variables in A are assumed few and selected so as to demand no regularization. Regularization is performed by searching each Xr for an appropriate number of orthogonal components that both contribute to model Y and capture relevant structural information in Xr. To estimate a single-theme model, we first propose an enhanced version of Supervised Component Generalized Linear Regression (SCGLR), based on a flexible measure of structural relevance of components, and able to deal with mixed-type explanatory variables. Then, to estimate the multiple-theme model, we develop an algorithm encapsulating this enhanced SCGLR: THEME-SCGLR. The method is tested on simulated data and then applied to rainforest data in order to model the abundance of tree species.

Keywords: Angola; Burundi; Cameroon; Gabon; Central African Republic; Congo; Democratic Republic of the Congo; Rwanda; United Republic of Tanzania; Zambia

Associated documents

Article (journal with impact factor)

Cirad staff who are authors of this publication: