Simulating and estimating the effect of gene transfer on bacterial pangenomes

NOTE

  • Master's thesis in Bioinformatics at the University of Tübingen
  • Thesis period: 01.12.2023 - 01.06.2024

Horizontal gene transfer (HGT) plays a significant role in shaping the genetic landscape of bacterial populations. In contrast to the more common vertical gene transfer, HGT allows the lateral exchange of genes. To study the impact of HGT on bacterial gene frequency spectra, we extend the existing mutation models of the open-source software msprime [1, 2] with a gene gain and loss model based on the Infinitely Many Genes model [3]. The ancestry and mutation simulation is then extended to support HGT events. In addition, the otherwise random ancestry simulation can be fixed to specified trees, which is essential for parameter estimation and for fitting the simulation to real data. We then develop a simulation-based testing framework to determine whether a gene frequency spectrum is consistent with neutral evolution. Finally, this framework is validated, and parameters are estimated from real pangenome data.
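The gene frequency spectrum mentioned above counts, for each k, how many genes occur in exactly k of the n sampled genomes. As a minimal illustration, the following sketch computes it from a gene presence/absence matrix. It is not the thesis implementation: the function name `gene_frequency_spectrum` and the input layout (one row per gene, one column per genome, entries 0 or 1) are assumptions made for this example only.

```python
from collections import Counter


def gene_frequency_spectrum(matrix):
    """Return [G_1, ..., G_n], where G_k is the number of genes
    present in exactly k of the n genomes.

    `matrix` is a hypothetical input format: a list of rows, one per gene,
    each row holding one 0/1 entry per genome.
    """
    if not matrix:
        return []
    n = len(matrix[0])
    if any(len(row) != n for row in matrix):
        raise ValueError("all rows must have one entry per genome")
    # Count how many genes occur in k genomes, for each observed k.
    counts = Counter(sum(row) for row in matrix)
    # Genes absent from every genome (k = 0) are not part of the spectrum.
    return [counts.get(k, 0) for k in range(1, n + 1)]


# Toy data: 4 genes across 3 genomes.
example = [
    [1, 1, 1],  # core gene, present in all genomes
    [1, 0, 0],  # singleton
    [0, 1, 0],  # singleton
    [1, 1, 0],  # present in two genomes
]
print(gene_frequency_spectrum(example))  # [2, 1, 1]
```

In this toy example two genes are singletons, one is shared by two genomes, and one is present in all three. A spectrum of this form is the summary statistic that a simulated data set would be compared against.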

TIP

A ready-to-use Jupyter notebook with examples can be found here: example_usage.ipynb

/not-a-feature/pangenome-gene-transfer-simulation

Footnotes

  1. https://tskit.dev/software/msprime.html

  2. Franz Baumdicker et al. “Efficient ancestry and mutation simulation with msprime 1.0”. In: Genetics 220.3 (Dec. 2021). Ed. by S. Browning. issn: 1943-2631. doi: 10.1093/genetics/iyab229. url: http://dx.doi.org/10.1093/genetics/iyab229

  3. Franz Baumdicker, Wolfgang R. Hess and Peter Pfaffelhuber. “The Infinitely Many Genes Model for the Distributed Genome of Bacteria”. In: Genome Biology and Evolution 4.4 (2012), pp. 443–456. doi: 10.1093/gbe/evs016. url: http://dx.doi.org/10.1093/gbe/evs016