Dr. Rachel Schwartz’s lab published two papers using our Andromeda cluster. She was also just awarded a new 3-year NSF grant titled “Disentangling biological and environmental drivers of diversification in the Andean flora” that will use Andromeda for the genomic analysis.
Modak, T., R. Literman, J. Puritz, K. Johnson, E. Roberts, D. Proestou, X. Guo, M. Gomez-Chiarri, and R.S. Schwartz. Exceptional genome wide copy number variation in the eastern oyster (Crassostrea virginica). Proceedings of the Royal Society B. (2021)
Abstract: Genomic structural variation is an important source of genetic and phenotypic diversity, playing a critical role in evolution. The recent availability of a high-quality reference genome for the eastern oyster, Crassostrea virginica, and whole-genome sequence data of samples from across the species range in the USA, provides an opportunity to explore structural variation across the genome of this species. Our analysis shows significantly greater individual-level duplications of regions across the genome than that of most model vertebrate species. Duplications are widespread across all ten chromosomes with variation in frequency per chromosome. The eastern oyster shows a large interindividual variation in duplications as well as particular chromosomal regions with a higher density of duplications. A high percentage of duplications seen in C. virginica lie completely within genes and exons, suggesting the potential for impacts on gene function. These results support the hypothesis that structural changes may play a significant role in standing genetic variation in C. virginica, and potentially have a role in their adaptive and evolutionary success. Altogether, these results suggest that copy number variation plays an important role in the genomic variation of C. virginica. This article is part of the Theo Murphy meeting issue ‘Molluscan genomics: broad insights and future directions for a neglected phylum’.
Literman, R., and R.S. Schwartz. Genome-scale profiling reveals higher proportions of phylogenetic signal in non-coding data. Molecular Biology and Evolution. (2021)
Abstract: Many evolutionary relationships remain controversial despite whole-genome sequencing data. These controversies arise, in part, due to challenges associated with accurately modeling the complex phylogenetic signal coming from genomic regions experiencing distinct evolutionary forces. Here, we examine how different regions of the genome support or contradict well-established relationships among three mammal groups using millions of orthologous parsimony-informative biallelic sites (PIBS) distributed across primate, rodent, and Pecora genomes. We compared PIBS concordance percentages among locus types (e.g. coding sequences (CDS), introns, intergenic regions), and contrasted PIBS utility over evolutionary timescales. Sites derived from noncoding sequences provided more data and proportionally more concordant sites compared with those from CDS in all clades. CDS PIBS were also predominant drivers of tree incongruence in two cases of topological conflict. PIBS derived from most locus types provided surprisingly consistent support for splitting events spread across the timescales we examined, although we find evidence that CDS and intronic PIBS may, respectively and to a limited degree, inform disproportionately about older and younger splits. In this era of accessible whole-genome sequence data, these results 1) suggest benefits to more intentionally focusing on noncoding loci as robust data for tree inference and 2) reinforce the importance of accurate modeling, especially when using CDS data.