|
|
|
Deepak, Akshay, Fernández-Baca, David, Tirthapura, Srikanta, Sanderson, Michael J and McMahon, Michelle M (2011) EvoMiner: Frequent Subtree Mining in Phylogenetic Databases. Technical Report 11-08, Computer Science, Iowa State University.
Abstract
The problem of mining collections of trees to iden-
tify common patterns, called frequent subtrees (FSTs),
arises often when trying to make sense of the results
of phylogenetic analysis. FST mining generalizes the
well-known maximum agreement subtree problem. Here
we present EvoMiner, a new algorithm for mining
frequent subtrees in collections of phylogenetic trees.
EvoMiner is an Apriori-like level-wise method, which
uses a novel phylogeny-specic constant-time candi-
date generation scheme, an ecient ngerprinting-based
technique for downward closure operation, and a lowest
common ancestor based support counting step that re-
quires neither costly subtree operations nor database
traversal. As a result of these techniques, our algo-
rithm achieves speed-ups of up to 100 times or more
over Phylominer, another algorithm for mining phyloge-
netic trees. EvoMiner can also work in vertical mining
mode, to use less memory at the expense of speed.
Contact site administrator at: ssg@cs.iastate.edu
|