min_adcl finds a good collection of sequences to cut from a placefile’s reference tree.
usage: min_adcl [options] placefile
| --point-mass | Treat every pquery as a point mass concentrated on the highest-weight placement. |
| --pp | Use posterior probability for the weight. |
| -c | Reference package path. |
| -o | Specify the filename to write to. |
| --out-dir | Specify the directory to write files to. |
| --prefix | Specify a string to be prepended to filenames. |
| --no-csv | Output the results as a padded matrix instead of csv. |
| --node-numbers | Put the node numbers in where the bootstraps usually go. |
| --seed | Set the random seed, an integer > 0. Default is 1. |
| -v | If specified, write progress output to stderr. |
| -t | If specified, the path to write the trimmed tree to. |
| --leaves | The maximum number of leaves to keep in the tree. |
| --max-adcl | The maximum ADCL that a solution can have. |
| --algorithm | Which algorithm to use to prune leaves. Choices are ‘greedy’, ‘full’, ‘force’, and ‘pam’. Default full. |
| --all-adcls-file | If specified, write out a csv file containing every intermediate computed ADCL. |
| --log | If specified with the full algorithm, write out a csv file containing solutions at every internal node. |
| --always-include | If specified, the leaf names read from the provided file will not be trimmed. |
| --leaf-mass | Fraction of mass to be distributed uniformly across leaves. Default 0. |
Important
rppr min_adcl prints the labels of the leaves that should be removed from the tree, not those that should be kept.
min_adcl chooses the set X of k sequences that minimizes the average distance between each placement and the closest sequence in X.
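To make the objective concrete, here is a minimal Python sketch of the quantity being minimized. It is an illustration only, not rppr's implementation: it assumes the placement-to-leaf distances are already available as a plain matrix (`dist`, `masses`, `leaves`, `adcl`, and `best_subset` are hypothetical names), and it searches every k-subset exhaustively, which is only practical for toy inputs.

```python
from itertools import combinations


def adcl(dist, masses, kept):
    """Mass-weighted average distance from each placement to its closest kept leaf.

    dist[i][j] is the (assumed precomputed) distance from placement i to leaf j.
    """
    total_mass = sum(masses)
    weighted = sum(m * min(row[j] for j in kept) for row, m in zip(dist, masses))
    return weighted / total_mass


def best_subset(dist, masses, n_leaves, k):
    """Try every k-subset of leaf indices and return the one with minimal ADCL."""
    return min(
        combinations(range(n_leaves), k),
        key=lambda kept: adcl(dist, masses, kept),
    )


# Toy data (made up for illustration): 3 placements, 4 leaves.
leaves = ["A", "B", "C", "D"]
dist = [
    [0.1, 0.9, 0.8, 0.7],
    [0.8, 0.2, 0.9, 0.6],
    [0.9, 0.8, 0.7, 0.1],
]
masses = [1.0, 1.0, 1.0]

kept = best_subset(dist, masses, n_leaves=len(leaves), k=2)
# Mirror the tool's convention of reporting the leaves to remove, not those kept.
removed = [name for i, name in enumerate(leaves) if i not in kept]
print("removed:", removed)
```

In this toy case the kept set is leaves A and D, so the reported leaves to remove are B and C.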
You can read more in the announcement or the paper.
See min_adcl_tree for the equivalent operation on a tree without placements.