Data Scientist: genomics | bioinformatics | phylogenomics
Genomics and Phylogenomics
I have developed two approaches for generating genomic data quickly and cost-effectively. The first is based on microfluidic PCR and takes advantage of the high throughput of the Fluidigm Access Array System, an instrument capable of running ~2300 PCRs simultaneously. For this method, I designed 96 primer pairs that were used to amplify 48 nuclear and 48 chloroplast loci in over 500 samples of Neobartsia. This resulted in a molecular matrix with multiple individuals per species and almost no missing data, ideal for coalescent-based species tree methods (Uribe-Convers et al., 2016).
The second approach uses 16 universal overlapping primer combinations and long-PCR to amplify complete chloroplast genomes. These amplicons are then sequenced on a high-throughput sequencer (e.g., Illumina) and the reads are assembled into complete plastomes. We tested this method in 12 orders of angiosperms and one order of gymnosperms, demonstrating the universality of the primers and the efficiency of the method (Uribe-Convers et al., 2014).
I have also worked with sequence capture, designing baits (capture probes) for over 800 nuclear loci that were used in studies of Burmeistera and the Centropogonid clade in Campanulaceae. My work included designing the baits from transcriptome data and shotgun libraries, processing the raw sequence data, and analyzing it in a phylogenomic and coalescent context.
Lastly, I’m working with raw genotyping-by-sequencing (GBS) data as part of my collaboration on the Tanager Bird project. This includes cleaning and filtering the data and conducting the phylogenetic, biogeographic, and population genetic analyses.
Systematics and Evolution
For most of my academic career (undergraduate and Ph.D.) I’ve worked on the genus Bartsia (Orobanchaceae), which has involved extensive fieldwork in the South American Andes, molecular systematics and phylogenetics, and genomics and phylogenomics. One result from this work showed that the genus is not monophyletic, which led to a new taxonomic classification in which the new genus Neobartsia was described. Furthermore, I showed that the Andean species are diversifying much faster than other species in the Rhinantheae clade, a group of ~500 species, and that this is associated with the biogeographic movement from the Mediterranean region and the colonization of the páramo ecosystem ~1.5 million years ago. I’m currently using coalescent-based species tree methods to elucidate the relationships in Neobartsia and to study the accumulation of lines of evidence for speciation in the group. This is in collaboration with Dr. David Tank.
I’m studying the phylogeographic patterns and population dynamics of Tanager birds (Thraupidae) in the Greater and Lesser Antilles. We are particularly interested in the gene flow between islands and in the expansion and contraction of population sizes over macroevolutionary time. This is in collaboration with Dr. Robert Ricklefs.
Finally, I also worked on the genus Burmeistera (Campanulaceae) during my postdoc at UMSL. I was interested in the interspecific relationships among the ~120 species in the genus, their pollination syndromes, and the possible barriers (pre- or post-zygotic) that prevent hybrids from occurring in nature. By hand-pollinating different species of Burmeistera in the greenhouse, I investigated what types of barriers exist and whether they were associated with how closely related the species are. This was in collaboration with Dr. Nathan Muchhala.
Ancient DNA from the Clarkia Fossil Bed
I have been working on extracting, amplifying, and sequencing DNA from Miocene (~15 Ma) plant tissue. The Clarkia fossil bed, located in Clarkia, Idaho, has the conditions needed to preserve unoxidized plant tissue that is suitable for molecular techniques.
Below, you can watch a video of how we lift the tissue, as well as the episode of “Plants Are Cool, Too!” in which this work was featured. Super cool!