Evolution 2013: Informatics

24 Jun 2013

Nasrallah, Chris: Dynamics of Alternative Pathways to Compensatory Substitution

Crossing fitness valleys coming from correlated states: e.g. rRNA stems.

Continuous-time Markov chain (CTMC): get rates to/from AB, Ab, aB, ab. Trick from Hideki Innan: collapse the deleterious states (Ab, aB) into one state.
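
A minimal sketch of the state-collapsing trick, not the speaker's actual model. It assumes a chain with no direct AB <-> ab or Ab <-> aB jumps, and that Ab and aB leave at identical rates (the condition under which lumping them into one state `D` is exact). The function names and the rate dictionary layout are hypothetical.

```python
from math import isclose


def collapse_deleterious(rates):
    """Lump the intermediate states Ab and aB into a single state 'D'.

    `rates` maps (from_state, to_state) -> transition rate for the
    four-state chain on AB, Ab, aB, ab (assumed layout, see lead-in).
    """
    # Lumping is only exact if both intermediates exit at the same rates.
    for target in ("AB", "ab"):
        if not isclose(rates[("Ab", target)], rates[("aB", target)]):
            raise ValueError("Ab and aB must leave at equal rates to be lumped")
    return {
        # Rates into the lumped state add up over its members.
        ("AB", "D"): rates[("AB", "Ab")] + rates[("AB", "aB")],
        ("ab", "D"): rates[("ab", "Ab")] + rates[("ab", "aB")],
        # Rates out of the lumped state are the shared per-member rates.
        ("D", "AB"): rates[("Ab", "AB")],
        ("D", "ab"): rates[("Ab", "ab")],
    }


def prob_cross_before_return(lumped):
    """Probability that the chain, sitting in D, jumps to ab before AB."""
    forward = lumped[("D", "ab")]
    back = lumped[("D", "AB")]
    return forward / (forward + back)
```

With the chain collapsed to three states, the chance of completing the valley crossing from the intermediate state is just a ratio of two exit rates, which makes it easy to see how the paths shift as parameters change.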

Simulations showing how paths through this CTMC change when changing parameters.

Effect of recombination is to decrease the probability of a double fixation.

Would like to turn these into something that could be useful for molecular evolution.

Wu, Steven; Sukumaran, Jeet; Ding, Yuantong; Rodrigo, Allen: Estimating evolutionary parameters and haplotypes simultaneously using short-read sequences derived from genetically variable populations

What do we do if we have lots of short reads only? Try to reconstruct haplotypes then make tree? Too many haplotypes, and

A Bayesian approach. Short-read likelihood: a binomial error model, which gives the probability of a short read given each of the candidate haplotypes. Operators: swap bases from short-read sequences (i.e. move bases from a short read onto a haplotype), swap a section between two haplotypes, and recombination.
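
A rough sketch of what a binomial error likelihood for one read could look like. The talk did not give the formula, so everything here is an assumption: the read is taken as already aligned at a known start position, errors are independent with a single per-base rate, and the read is a mixture over haplotypes weighted by their frequencies. All function and parameter names are hypothetical.

```python
from math import comb


def count_mismatches(read, haplotype, start):
    """Number of positions where the aligned read differs from the haplotype."""
    window = haplotype[start:start + len(read)]
    if len(window) != len(read):
        raise ValueError("read runs off the end of the haplotype")
    return sum(r != h for r, h in zip(read, window))


def read_given_haplotype(read, haplotype, start, error_rate):
    """Binomial probability of the observed mismatch count (assumed model)."""
    n = len(read)
    k = count_mismatches(read, haplotype, start)
    return comb(n, k) * error_rate**k * (1.0 - error_rate) ** (n - k)


def read_likelihood(read, start, haplotypes, freqs, error_rate):
    """Mixture likelihood of one read over haplotypes with frequencies freqs."""
    return sum(
        f * read_given_haplotype(read, h, start, error_rate)
        for h, f in zip(haplotypes, freqs)
    )
```

An MCMC operator that copies bases from a read onto a haplotype changes the mismatch counts, and hence this likelihood, for every read overlapping the edited positions.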

Simulations with SimCoal and MetaSim. Align short reads with SHRiMP. They do a good job of re-finding effective population size.

Next: reversible-jump MCMC (rjMCMC) to sort out the number of haplotypes.

Darriba, Diego; Taboada, Guillermo; Doallo, Ramon; Posada, David: Fast selection of multi-gene models of molecular evolution

How do we find partitions for a collection of genes such that each partition has a single model? How do we assign collections of models to partitioned data sets? The Lanfear et al. (2012) paper scores partitioning schemes with AIC, etc., and searches them with greedy and hierarchical clustering algorithms (PartitionFinder). Simulation: use the Rand index to compare partitions. They re-implement the Lanfear method in “PartitionTest”, which also does model assignment.
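
For reference, the Rand index between two partitioning schemes is the fraction of gene pairs on which the two schemes agree (both put the pair together, or both keep it apart). A small sketch, assuming each scheme is given as a list of partition labels, one per gene; this representation and the function name are assumptions, not PartitionTest's interface.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs treated the same way by both partitions."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    if len(labels_a) < 2:
        raise ValueError("need at least two items to form a pair")
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        together_a = labels_a[i] == labels_a[j]
        together_b = labels_b[i] == labels_b[j]
        agree += together_a == together_b
        total += 1
    return agree / total
```

The index only depends on which genes are grouped together, not on the label values, so `[0, 0, 1, 1]` and `[1, 1, 0, 0]` count as the same scheme.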

Ding, Yuantong; Wu, Steven; Sukumaran, Jeet; McConnell, Christie; Rodrigo, Allen: Evolutionary analyses using reconstructed haplotypes from next-generation sequencing of mixed samples: an evaluation of current programs

Can we use short-read data to reconstruct trees?

SerialSimCoal to simulate tree. MetaSim to simulate short reads.

ShoRAH, ViSpA, and PredictHaplo to reconstruct haplotypes, along with a number of alignment programs. Fewer than 1% of simulations recovered the correct number of original haplotypes.

Problem: have to compare unlabeled trees. Use tree balance statistics.
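
The talk did not say which balance statistics were used. As one common example, here is a sketch of the Colless index, which sums over internal nodes the absolute difference in leaf counts between the two child subtrees. It assumes a strictly binary rooted tree written as nested 2-tuples; leaf labels are ignored, which is the point when comparing unlabeled trees.

```python
def colless(tree):
    """Return (number_of_leaves, Colless index) for a nested-tuple binary tree.

    Any non-tuple value is treated as a leaf; every tuple must have
    exactly two children (assumed representation).
    """
    if not isinstance(tree, tuple):
        return 1, 0
    if len(tree) != 2:
        raise ValueError("tree must be strictly binary")
    n_left, c_left = colless(tree[0])
    n_right, c_right = colless(tree[1])
    # Imbalance at this node plus the imbalance accumulated below it.
    return n_left + n_right, c_left + c_right + abs(n_left - n_right)
```

A fully balanced four-leaf tree scores 0 and a four-leaf caterpillar scores 3, so a true tree and a reconstructed tree can be compared by shape without matching up tip names.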

Problem: different-sized trees. Use maximum phylogenetic diversity (PD) to obtain trees of the same size.

Overall, bad performance for making trees and for population parameters. Work is needed here.