Variation in the structure of genomes is a crucial component of genetic innovation and an important factor in the evolution of genome complexity. Such structural variation (SV) includes mutations such as duplications, deletions, translocations, inversions, and transpositions. SV is ubiquitous, creates new genes and new gene structures, alters gene expression, and, in humans, contributes to genetic disease. Although SVs contribute to adaptation, genetic drift has also been proposed to occupy a central role in the acquisition and evolution of SV. Consequently, the relative importance of mechanisms driving the acquisition of SVs remains controversial. When mutation limits adaptation, we predict that selection is more effective at changing allele frequencies in larger populations and genetic drift is more effective in smaller populations. However, when the mutation rate is high enough to permit frequent adaptation, levels of variation depend only weakly on population size. As a result, determining how, and to what extent, population size is associated with patterns of genetic diversity within and between species can reveal much about which factors dominate evolution. To determine the relative influence of mutation, natural selection, and genetic drift on the evolution of genome structure, we are contrasting patterns of polymorphism and divergence between large and small populations.
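The prediction that selection is more effective in larger populations can be illustrated with a textbook result. The sketch below is not our analysis pipeline; it uses Kimura's diffusion approximation for the fixation probability of a single new mutation, and it assumes genic selection, a diploid population whose effective size equals its census size, and illustrative values of the selection coefficient. The function names are hypothetical.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Kimura's diffusion approximation for one new copy in a diploid population.

    Assumptions (illustrative only): genic selection with coefficient s,
    effective size equal to census size n, initial frequency 1 / (2n).
    """
    if s == 0:
        return 1.0 / (2 * n)  # neutral case: fixation by drift alone
    try:
        # (1 - exp(-2s)) / (1 - exp(-4ns)), written with expm1 for accuracy
        return math.expm1(-2.0 * s) / math.expm1(-4.0 * n * s)
    except OverflowError:
        # Deleterious mutation in a very large population:
        # the fixation probability is effectively zero.
        return 0.0


def relative_to_neutral(s: float, n: int) -> float:
    """Fixation probability divided by the neutral expectation 1 / (2n)."""
    return fixation_probability(s, n) * 2 * n


if __name__ == "__main__":
    # Same |s| in small, intermediate, and large populations.
    for n in (100, 10_000, 1_000_000):
        print(n, relative_to_neutral(0.001, n), relative_to_neutral(-0.001, n))
```

With the same selection coefficient, the ratio stays close to 1 in the small population, where drift dominates, and moves far from 1 in the large populations, where beneficial mutations are strongly favored and deleterious ones are efficiently removed.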
To address the evolution of mutations that sculpt genome structure, we are comparing the evolution of SVs in pairs of Drosophila species with large and small population sizes. Our preliminary studies demonstrate that SV discovery is strongly limited by the quality of the reference genome. Draft genomes, which comprise most reference genomes, have poor power to discover SVs, missing up to 70% of variants, many of which are phenotypically important. Such barriers to SV discovery can severely mislead evolutionary inferences made with poor references. To discover SVs within and between species, we are currently using methods developed in the lab to assemble long-molecule sequence data (PacBio and Oxford Nanopore) into high-quality reference genomes for species that either lack them or have only draft genomes. These reference genomes will serve as the basis for comparative genomics and population genomics studies that will uncover the rates of evolution, functional consequences, fitness effects, and rates of adaptation of SVs.
We are investigating the genetic architecture of expression variation by studying different yeast strains and their F1 hybrids. Our current focus is on developing statistical models that properly measure allele-specific expression and on applying them to single-cell data.
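For orientation, a common baseline for allele-specific expression is an exact binomial test on the reads assigned to each parental allele in the F1 hybrid. The sketch below shows that baseline only; it is not the model under development here, it assumes reads are independent and unambiguously assigned to an allele, and the function name and counts are hypothetical.

```python
from math import comb


def allelic_imbalance_p_value(allele_a_reads: int, allele_b_reads: int,
                              expected_fraction: float = 0.5) -> float:
    """Two-sided exact binomial test for unequal expression of two alleles.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one. Intended for modest counts (up to a few hundred reads);
    very large counts would need a log-space implementation.
    """
    n = allele_a_reads + allele_b_reads
    if n == 0:
        raise ValueError("at least one read is required")
    p = expected_fraction
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[allele_a_reads]
    # Small relative tolerance so that ties are counted despite rounding.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical read counts for one gene in one F1 hybrid sample.
    print(allelic_imbalance_p_value(70, 30))
    print(allelic_imbalance_p_value(52, 48))
```

A simple test like this ignores overdispersion and mapping bias toward the reference allele, which is part of why more careful models are needed, particularly for sparse single-cell counts.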
In collaboration with Brandon Gaut, we are using a highly replicated experimental evolution resource to examine the distribution of fitness effects of adaptation to an extreme change in environment (namely, culturing E. coli at 42 °C). We are starting by creating a cost-effective barcoding fitness assay that will be able to accommodate thousands of fitness assays in a single lane of sequencing.
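One standard way to turn barcode counts into a fitness estimate is to regress the log ratio of a barcoded lineage to a reference lineage on the number of generations; the slope approximates the selection coefficient per generation. The sketch below illustrates that general idea under simple assumptions (nonzero counts at every time point, a single shared reference, no correction for sequencing noise). It is not a description of the assay being built, and the function name and counts are hypothetical.

```python
import math
from typing import Sequence


def selection_coefficient(generations: Sequence[float],
                          barcode_counts: Sequence[float],
                          reference_counts: Sequence[float]) -> float:
    """Least-squares slope of ln(barcode / reference) against generations."""
    if not (len(generations) == len(barcode_counts) == len(reference_counts)):
        raise ValueError("all inputs must have the same length")
    if len(generations) < 2:
        raise ValueError("at least two time points are required")
    # Assumes every count is nonzero; zero counts need separate handling.
    log_ratios = [math.log(b / r)
                  for b, r in zip(barcode_counts, reference_counts)]
    mean_t = sum(generations) / len(generations)
    mean_y = sum(log_ratios) / len(log_ratios)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(generations, log_ratios))
    sxx = sum((t - mean_t) ** 2 for t in generations)
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical counts for one barcode sampled at three time points.
    print(selection_coefficient([0, 5, 10], [120, 180, 260], [1000, 1010, 990]))
```

Because each barcode needs only a count per time point, many lineages can be pooled and scored from one sequencing run, which is what makes this kind of assay cost-effective.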