Analyzing Genetic Diversity of Thermophilic Ammonia-Oxidizing Archaea (ThAOA) Populations in Geothermal Hot Springs Using Metagenomic Datasets
By: Carlos Gomez
Department: Biology
Faculty Advisor: Dr. José R. De La Torre
Metagenomic datasets contain rich information on the genetic variation of microorganisms within the sampled community. Such variation is often concentrated in genomic regions, known as flexible genomic islands, that vary between individual cells within the population. In this study, we took a computational approach to identify and characterize flexible genomic islands in a lineage of thermophilic archaea found in terrestrial hot springs. These flexible genes represent the variability in physiology and adaptation present within the microbial population. Using metagenomic datasets from samples collected at different temperatures (60°C, 70°C, and 85°C) at a single site in Nevada, we mapped individual metagenomic short reads to our thermophilic ammonia-oxidizing archaeon (ThAOA) reference genome, Nitrosocaldus gerlachensis (GBSF). By calculating the density of mapped reads in different regions of the reference GBSF genome, we have identified regions that are conserved in all representatives of the population (the core genome), as well as regions of variability (the flexible genome). In addition to these metagenomic read recruitments, we also carried out de novo assemblies of the three metagenomes. Binning of the resulting contigs identified metagenome-assembled genomes (MAGs) corresponding to close relatives of GBSF in these datasets. Genomic comparisons between these MAGs and our reference genomes provided a different approach to identifying flexible genomic regions unique to individual strains in the hot spring. By clustering all of the ThAOA genes according to their presence or absence in the reference genomes and MAGs, we identified gene clusters assigned to the core and flexible genomes. Computing the pangenome has allowed us to study the evolutionary impact of horizontal gene transfer and compare the gene content of the strains.
By comparing genes found in the variable genome in samples collected at different temperatures, we hope to understand the adaptation and evolution of uncultured microbial communities that play a key role in the nitrogen cycle.
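
The read-density step described in the abstract can be illustrated with a minimal sketch. It assumes a list of per-base read depths along the reference genome has already been produced by a read mapper (the abstract does not name the tools used), averages depth over fixed windows, and labels windows whose depth falls well below the genome-wide median as candidate flexible regions. The function names, the 1,000 bp window, and the 20% cutoff are all hypothetical choices made for illustration, not values from the study.

```python
from statistics import median


def window_depths(depth, window):
    """Mean read depth in consecutive, non-overlapping windows."""
    means = []
    for start in range(0, len(depth), window):
        chunk = depth[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means


def classify_windows(depth, window=1000, low_fraction=0.2):
    """Label each window 'core' or 'flexible' by relative read depth.

    A window is called 'flexible' when its mean depth is below
    low_fraction times the median window depth (an assumed cutoff).
    """
    means = window_depths(depth, window)
    if not means:
        return []
    baseline = median(means)
    threshold = low_fraction * baseline
    return ["flexible" if m < threshold else "core" for m in means]


# Toy example: a poorly recruited 1 kb region flanked by well-covered regions.
toy_depth = [50] * 1000 + [2] * 1000 + [48] * 1000
labels = classify_windows(toy_depth)
print(labels)
```

In this toy input, the middle window recruits far fewer reads than its neighbours, which is the pattern expected for a region that is present in the reference genome but absent from most cells in the sampled population.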