KmerCity and AdmixKmer: Repeated K-mers-based tools for analyzing ancestry in complex admixed genomes
Garsmeur O., Rio S., Pompidor N., D'Hont A. 2026. In: Environmental and Agronomical Genomics Symposium, February 18-20, 2026: Books of abstracts. Paris: France Génomique, p. 57. Environmental and Agronomical Genomics Symposium (EAGS 2026), 2026-02-18/2026-02-20, Paris (France).
We developed an approach that uses the distribution of repeated k-mers derived from whole-genome sequencing (WGS) data to analyze genome ancestry in complex admixed and/or polyploid genomes. It rests on the premise that highly repeated k-mers predominantly correspond to transposable elements (TEs), which, owing to their lineage-specific activity, generate traceable genomic signatures that can serve as markers of hybridization events. We developed two computational tools, KmerCity and AdmixKmer. KmerCity extracts and selects repeated k-mers from WGS data and builds a repeated k-mer count matrix that enables comparative analysis of k-mer distribution among accessions. Shared k-mer profiles are visualized using a graph-based approach, which allows the identification of genus-, species-, or subgroup-specific k-mer signatures. These signatures can then be used to detect admixed accessions. AdmixKmer is a method adapted from classic admixture models (such as STRUCTURE and ADMIXTURE). It uses the repeated k-mer count matrix generated by KmerCity to estimate the proportions of ancestral contributions in the panel of analyzed accessions. Neither tool requires a reference genome assembly, and both are alignment-free, which makes them broadly applicable to species with limited genomic resources. Because they rely on repetitive k-mers, they require only limited sequencing depth, which is of particular interest for polyploid species. They allowed the detection of small ancestral contributions even in the absence of pure representatives of the contributor. KmerCity and AdmixKmer should be particularly useful for characterizing genomes and populations that combine admixture and polyploidy.
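The general idea of a repeated k-mer count matrix and of group-specific signatures can be illustrated with a toy sketch. This is not the KmerCity or AdmixKmer implementation, which the abstract does not detail. The function names, the `min_count` threshold, the presence/absence rule for signatures, and the toy reads are all assumptions made for illustration. In particular, the summed signature counts at the end are only a crude indicator that an accession carries signatures from several groups, not the admixture model used by AdmixKmer.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-mer of length k across a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def build_count_matrix(samples, k, min_count):
    """Build a k-mer x accession count matrix.

    A k-mer is kept as "repeated" if it reaches min_count in at least
    one accession (hypothetical selection rule). Rows follow the sorted
    k-mers, columns follow the sorted accession names.
    """
    per_sample = {name: count_kmers(reads, k) for name, reads in samples.items()}
    names = sorted(per_sample)
    kmers = sorted({kmer for counts in per_sample.values()
                    for kmer, n in counts.items() if n >= min_count})
    matrix = [[per_sample[name][kmer] for name in names] for kmer in kmers]
    return names, kmers, matrix

def signature_kmers(names, kmers, matrix, groups):
    """For each reference group, keep the k-mers present in all of its
    members and absent from every other reference accession."""
    col = {name: j for j, name in enumerate(names)}
    references = {name for members in groups.values() for name in members}
    signatures = {}
    for group, members in groups.items():
        others = references - set(members)
        signatures[group] = [
            kmer for kmer, row in zip(kmers, matrix)
            if all(row[col[m]] > 0 for m in members)
            and all(row[col[o]] == 0 for o in others)
        ]
    return signatures

def signature_load(accession, names, kmers, matrix, signatures):
    """Sum the counts of each group's signature k-mers in one accession."""
    j = names.index(accession)
    row_of = dict(zip(kmers, matrix))
    return {group: sum(row_of[kmer][j] for kmer in sig)
            for group, sig in signatures.items()}

# Toy reads standing in for WGS data (invented for this example).
SAMPLES = {
    "A1": ["ACGTACGTACGT", "ACGTACGT"],
    "A2": ["ACGTACGTAC"],
    "B1": ["TTGCATTGCATTGCA"],
    "H1": ["ACGTACGT", "TTGCATTGCA"],  # carries repeats from both groups
}
GROUPS = {"A": ["A1", "A2"], "B": ["B1"]}  # hypothetical reference groups

names, kmers, matrix = build_count_matrix(SAMPLES, k=4, min_count=2)
signatures = signature_kmers(names, kmers, matrix, GROUPS)
load = signature_load("H1", names, kmers, matrix, signatures)
is_admixed = sum(1 for total in load.values() if total > 0) > 1
```

In this sketch, H1 is flagged because it carries signature k-mers from both reference groups. Real data would call for a dedicated k-mer counter and much larger values of `k`.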
Related documents
Conference paper
Cirad staff, authors of this publication:
- D'Hont Angélique — Bios / UMR AGAP
- Garsmeur Olivier — Bios / UMR AGAP
- Rio Simon — Bios / UMR AGAP
