SPS22-6GL

Using Long Read Technology to Access Genetic Variation in Natural Thermophilic Ammonia-Oxidizing Archaea Populations

By: Andre Williams

Department: Cellular & Molecular Biology

Faculty Advisor: Dr. José R. de la Torre

Comparative genomics of microbial species has given us insight into the mechanisms and drivers of evolution. However, many of these studies rely on comparing genomes from individual cultivated strains and so do not adequately assess genomic variation at the level of entire natural populations. The de la Torre lab studies the ecology and evolution of thermophilic ammonia-oxidizing archaea (ThAOA), a group of archaea found in terrestrial hot springs around the world. To date, our lab has sequenced the genomes of 20 ThAOA strains from hot springs in the United States and China. A previous student in our lab pioneered the use of metagenomic datasets, in which the genomic DNA from an entire microbial community is sequenced, to assess the genetic diversity within natural populations of ThAOA. However, this approach has been limited by the sequencing technology used, which produces short reads of 100 to 250 nucleotides. Such short reads make it challenging to get a genome-wide perspective on how mutations and genes co-vary, because a single read rarely spans more than one variable site. Recent advances such as long-read sequencing, with reads greater than 10,000 bases, can overcome this limitation by sequencing longer pieces of DNA, allowing us to track the co-occurrence of mutations across greater genomic distances. Following our lab's protocol, I have assembled a metagenome from short-read data obtained from sediment samples from Great Boiling Springs (GBS). After assembly, I binned the assembled sequences by organism and annotated the genes. Next, I plan to use computational programs to find mutations that occur together, or in phase. In parallel, I will apply the same analyses to long-read data and compare the results. Being able to track mutations across longer stretches of DNA will help us understand genetic variation on a population scale.