convexify identifies a minimal set of leaves to cut for taxonomic concordance.
usage: convexify [-c my.refpkg | --tree my.tre --colors my.csv]
-c | Reference package path. Required unless --tree and --colors are supplied. |
--node-numbers | Put the node numbers in where the bootstraps usually go. |
--tree | A tree file in newick format to work on in place of a reference package. |
--colors | A CSV file of the colors on the tree supplied with --tree. |
-t | If specified, the path to write the discordance tree to. |
--cut-seqs | If specified, the path to write a CSV file of cut sequences per-rank to. |
--alternates | If specified, the path to write a CSV file of alternate colors per-sequence to. |
--check-all-ranks | When determining alternate colors, check all ranks instead of the least recent uncut rank. |
--all-alternates | When determining alternate colors, ignore the taxonomy and show all alternates. |
--cutoff | Any trees with a maximum badness over this value are skipped. Default: 12. |
--limit-rank | If specified, only convexify at the given ranks. Ranks are given as a comma-delimited list of names. |
--timing | If specified, save timing information for solved trees to a CSV file. |
--rooted | Strictly evaluate convexity; ensure that each color sits in its own rooted subtree. |
--naive | Use the naive convexify algorithm. |
--no-early | Don’t terminate early when convexifying. |
convexify applies an exact dynamic program to identify leaves of a phylogenetic tree that don’t agree with their taxonomic labels. You can read more in the announcement or the paper.
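To make "agreeing with taxonomic labels" concrete, here is a minimal Python sketch that only checks whether a leaf coloring is convex. It assumes the usual definition of a convex coloring (the minimal subtrees spanning the leaves of each color share no node). It is not convexify's dynamic program and does not choose which leaves to cut. The function name `is_convex` and the dictionary-based tree layout are hypothetical, chosen for illustration.

```python
from collections import Counter, defaultdict


def is_convex(children, leaf_colors, root):
    """Return True if no tree node lies in the spanning subtree of two colors.

    children: dict mapping each internal node to a list of its child nodes.
    leaf_colors: dict mapping each leaf to its color (any string).
    root: an arbitrary node used only to orient the traversal.
    """
    total = Counter(leaf_colors.values())
    below = {}                 # node -> Counter of leaf colors at or below it
    spans = defaultdict(set)   # node -> colors whose spanning subtree hits it

    # Iterative pre-order; every node appears after its parent.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children.get(node, []))

    # Reverse pre-order visits children before parents.
    for node in reversed(order):
        if node in leaf_colors:
            below[node] = Counter({leaf_colors[node]: 1})
            spans[node].add(leaf_colors[node])
        else:
            counts = Counter()
            for child in children[node]:
                counts.update(below[child])
            below[node] = counts

    # A color crosses an edge when it has leaves on both sides of that edge.
    for node in order:
        for child in children.get(node, []):
            for color, n in below[child].items():
                if n < total[color]:
                    spans[node].add(color)
                    spans[child].add(color)

    return all(len(colors) <= 1 for colors in spans.values())


# Hypothetical four-leaf tree ((a,b),(c,d)).
tree = {"root": ["x", "y"], "x": ["a", "b"], "y": ["c", "d"]}
concordant = {"a": "red", "b": "red", "c": "blue", "d": "blue"}
discordant = {"a": "red", "b": "blue", "c": "red", "d": "blue"}
```

In this toy tree, `concordant` is convex, while `discordant` is not: the paths joining the two red leaves and the two blue leaves both pass through node `x`, so at least one leaf would have to be cut.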
You can either specify a reference package or a tree and a CSV file of colors. The CSV file is formatted as follows:
leaf_name1,color1
leaf_name2,color2
where the colors are just strings.
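As an illustration of the layout above, the following Python sketch writes a two-column colors file with the standard `csv` module. The helper name `write_colors` and the example leaf and color names are hypothetical, and the sketch assumes that no header row is expected, as in the format shown.

```python
import csv
import io


def write_colors(rows, handle):
    """Write (leaf_name, color) pairs as two-column CSV rows without a header."""
    writer = csv.writer(handle, lineterminator="\n")
    for leaf_name, color in rows:
        writer.writerow([leaf_name, color])


# Build the file contents in memory; pass an open file handle to write to disk.
buffer = io.StringIO()
write_colors([("leaf_name1", "color1"), ("leaf_name2", "color2")], buffer)
colors_csv = buffer.getvalue()
```

Saved as my.csv, such a file would be passed alongside the tree, as in the usage line: convexify --tree my.tre --colors my.csv.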
Note that --no-early and --naive don’t change the results. They just make convexify run (much) more slowly for all but the most trivial problems.