A single polyploidization event at the origin of the tetraploid genome of Coffea arabica is responsible for extremely low genetic variation in wild and cultivated germplasm
Scalabrin S., Toniutti L., Di Gaspero G., Scaglione D., Magris G., Vidotto M., Pinosio S., Cattonaro F., Magni F., Jurman I., Cerutti M., Suggi Liverani F., Navarini L., Del Terra L., Pellegrino G., Ruosi M.R., Vitulo N., Valle G., Pallavicini A., Graziosi G., Klein P.E., Bentley N., Murray S.C., Solano W., Al Hakimi A., Schilling T., Montagnon C., Kotch G., Bertrand B., Morgante M. 2021. In: 28th Conference of Association for the Science and Information on Coffee - Books of abstracts. Montpellier: ASIC, p. 102. Conference of Association for the Science and Information on Coffee (ASIC 2021). 28, 2021-06-28/2021-07-01, Montpellier (France).
RATIONALE - The genome of the allotetraploid species Coffea arabica L. was sequenced to assemble the two component subgenomes independently (putatively deriving from C. canephora and C. eugenioides) and to perform a genome-wide analysis of the genetic diversity in cultivated coffee germplasm and in wild populations growing in the center of origin of the species.

METHODS - We studied an individual of C. arabica 'Bourbon Vermelho'. A BAC library of 175,872 BAC clones was constructed and sequenced using an Illumina HiSeq2000. Each BAC pool was assembled independently with the tool ABySS and scaffolded with SSPACE. Genotyping by sequencing (GBS) was conducted using the restriction enzyme PstI followed by single-end sequencing on an Illumina HiSeq2000. SNP calling was performed using Stacks. Principal Component Analysis (PCA) was performed using the R package ade4. A hierarchical study of the diversity was conducted using a model-based clustering procedure with admixture as implemented in STRUCTURE.

RESULTS - We assembled a total length of 1.536 Gbp, of which 444 Mb and 527 Mb were assigned to the canephora and eugenioides subgenomes, respectively, and predicted 46,562 gene models, of which 21,254 and 22,888 were assigned to the canephora and eugenioides subgenomes, respectively. Through genome-wide SNP genotyping of 736 C. arabica accessions, we analyzed the genetic diversity in the species and its relationship with geographic distribution and historical records.

CONCLUSIONS & PERSPECTIVES - We observed a weak population structure due to low-frequency derived alleles and highly negative values of Tajima's D, suggesting a recent and severe bottleneck, most likely resulting from a single event of polyploidization, not only for the cultivated germplasm but also for the entire species. This conclusion is strongly supported by forward simulations of mutation accumulation. However, PCA revealed a cline of genetic diversity reflecting a west-to-east geographical distrib
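
For readers unfamiliar with Tajima's D, the statistic cited in the conclusions, the following is a minimal Python sketch of the standard estimator (Tajima 1989). It is not the authors' pipeline: the function name `tajimas_d`, the 0/1 haplotype encoding (1 = derived allele) and the toy data are assumptions made for illustration, and the sketch ignores the complications of tetraploid GBS data. It compares the mean number of pairwise differences with the value expected from the number of segregating sites. An excess of low-frequency alleles pushes D below zero.

```python
import math

def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotypes encoded as lists of 0/1."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    # Derived-allele count at each site; keep only segregating sites.
    counts = [sum(site) for site in zip(*haplotypes)]
    seg = [k for k in counts if 0 < k < n]
    s = len(seg)
    if s == 0:
        return math.nan  # D is undefined without segregating sites
    # Mean number of pairwise differences (pi), computed from allele counts.
    pi = sum(2 * k * (n - k) for k in seg) / (n * (n - 1))
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return math.nan  # degenerate case, e.g. n == 2
    return (pi - s / a1) / math.sqrt(variance)

# Toy data (hypothetical): 10 haplotypes, 5 sites.
# "rare": every variant is a singleton, as expected after a severe bottleneck.
rare = [[1 if i == j else 0 for j in range(5)] for i in range(10)]
# "common": every variant is carried by half of the sample.
common = [[1] * 5 if i < 5 else [0] * 5 for i in range(10)]

print(tajimas_d(rare))    # negative: excess of low-frequency alleles
print(tajimas_d(common))  # positive: excess of intermediate-frequency alleles
```
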
Conference paper
Cirad staff, authors of this publication:
- Bertrand Benoît — Bios / UMR DIADE
- Toniutti Lucile — Bios / UMR AGAP