FAQ

How can I cite pplacer?

Please cite:

@article{matsen2010pplacer,
  title={pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree},
  author={Matsen, F.A. and Kodner, R.B. and Armbrust, E.V.},
  journal={BMC Bioinformatics},
  volume={11},
  number={1},
  pages={538},
  year={2010},
  publisher={BioMed Central Ltd}
}

If you use a method implemented in guppy (such as edge PCA), we hope you will also cite the paper describing that method.

Why is pplacer taking up so much memory?

Because it caches likelihood vectors for all of the internal nodes. This is what makes it fast!

Pplacer uses about a quarter as much memory when placing on a FastTree tree as when placing on a RAxML tree inferred with GTRGAMMA. If your reads are short and fall in a fixed region, the memory used by pplacer v1.1 alpha08 (or later) scales with the total number of non-gap columns in your query alignment. You can also reduce memory use (and run time) by cutting down the size of your reference tree.
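To get a rough feel for the quantity that drives memory use here, you can count the non-gap columns of a query alignment yourself. The sketch below is not part of pplacer. The function name `count_non_gap_columns` is hypothetical, and it assumes that "-" is the only gap character and that a column counts as non-gap when at least one query has a residue in it.

```python
GAP_CHARS = "-"  # assumption: "-" is the only gap character in the alignment


def count_non_gap_columns(query_rows):
    """Count alignment columns in which at least one query has a non-gap character.

    query_rows is a list of aligned query sequences, all of the same length.
    """
    if not query_rows:
        return 0
    width = len(query_rows[0])
    if any(len(row) != width for row in query_rows):
        raise ValueError("all aligned sequences must have the same length")
    # A column is "non-gap" if any query has a residue there.
    return sum(
        any(row[i] not in GAP_CHARS for row in query_rows)
        for i in range(width)
    )


if __name__ == "__main__":
    # Three short reads covering the same small region of a 10-column alignment.
    reads = [
        "--ACGT----",
        "---CGTA---",
        "--ACG-----",
    ]
    print(count_non_gap_columns(reads))
```

Reads confined to one region touch only a few columns, which is why this count can be much smaller than the full alignment width.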

Additionally, if you are placing metagenomic reads onto a very wide alignment (such as a concatenation), read about the --groups feature in the documentation.

If all else fails, pplacer also has a --mmap-file flag. If this flag is passed, pplacer will use an mmapped file instead of making large allocations.

The pplacer details section covers mmap in more depth.
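As a rough illustration of passing this flag from a script, the sketch below assembles a pplacer command line. The helper `pplacer_mmap_command` and all file names are hypothetical. It also assumes the reference package is given with -c and that --mmap-file takes the path of the file to map, so check `pplacer --help` for your version before relying on it.

```python
import subprocess


def pplacer_mmap_command(refpkg, query_alignment, mmap_path):
    """Build an argument list that asks pplacer to use an mmapped file.

    Assumptions: the reference package is passed with -c, and --mmap-file
    takes the path of the file to be mapped.
    """
    return ["pplacer", "-c", refpkg, "--mmap-file", mmap_path, query_alignment]


def run_pplacer_with_mmap(refpkg, query_alignment, mmap_path):
    """Run pplacer with the command above; raises if pplacer exits non-zero."""
    subprocess.run(
        pplacer_mmap_command(refpkg, query_alignment, mmap_path), check=True
    )


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the argument order.
    print(" ".join(pplacer_mmap_command("example.refpkg", "queries.fasta", "pplacer.mmap")))
```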

Why doesn’t pplacer output a Newick tree?

Pplacer does not output a Newick tree because it does not build phylogenetic trees. It maps sequences into trees, with uncertainty, and thus its output format encodes those maps. You can read more about the pplacer/EPA output format and our motivations for it in the corresponding paper.

If you do need a Newick tree, you can look at guppy’s tog command, but be careful, as the output is not a real phylogenetic tree. For example, two very similar query sequences that are both rather different from the reference sequences will end up on long parallel branches rather than in a subtree of size two.
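The caveat is easier to see with a toy example. The sketch below is not guppy's implementation. It is a hypothetical helper, `attach_on_edge`, that hangs each query independently off a single reference edge, which is enough to show what "long parallel branches" look like in Newick. All names and branch lengths are made up.

```python
def attach_on_edge(leaf, edge_length, placements):
    """Return a Newick fragment for one reference edge with queries attached.

    leaf: name of the reference leaf at the bottom of the edge.
    edge_length: length of the edge from the leaf up to its parent.
    placements: list of (query_name, distal_length, pendant_length), where
        distal_length is the distance from the leaf to the attachment point.
    Each query gets its own pendant branch; queries are never grouped together.
    """
    subtree = leaf
    covered = 0.0  # distance from the leaf that has already been emitted
    for name, distal, pendant in sorted(placements, key=lambda p: p[1]):
        subtree = f"({subtree}:{distal - covered:g},{name}:{pendant:g})"
        covered = distal
    return f"{subtree}:{edge_length - covered:g}"


if __name__ == "__main__":
    # Two near-identical queries, both far from the reference leaf C.
    placements = [("q1", 0.5, 5.0), ("q2", 0.6, 5.0)]
    print(attach_on_edge("C", 2.0, placements))
    # prints ((C:0.5,q1:5):0.1,q2:5):1.4
```

In this toy output q1 and q2 sit on separate pendant branches of length 5, so the path between them has length 10.1 even though the sequences are nearly identical. A tree inferred from the sequences themselves would instead be expected to join them in a two-leaf subtree with short tips.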