Tree Clustering with Trevis

Trevis clusters trees with our hierarchical agglomerative clustering (HAC) implementation. Given a list of ContextTrees, you can ask Trevis to produce a Dendrogram.

To cluster trees, store them in an ObservableList<ContextTree> and wrap that list in a ContextTreeExperiment (a special type of HAC Experiment). Then pick a tree distance measure and wrap it in a ContextTreeDissimilarityMeasure (a special kind of HAC DissimilarityMeasure). Finally, pick a HAC agglomeration method, which determines how the distance between two clusters is derived from the distances between their trees.
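To make the roles of the distance measure and the agglomeration method concrete, here is a minimal, self-contained sketch that does not use the Trevis or HAC classes. It rests on several assumptions: each tree is reduced to the set of its node names, the distance is a Jaccard distance on those sets (which may differ from what NodeSetDistance actually computes), and the agglomeration method is the textbook group average (mean pairwise distance). The class GroupAverageSketch and all of its methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: NOT the Trevis/HAC implementation.
public class GroupAverageSketch {

  // Helper to build a set of node names (stand-in for a real tree).
  static Set<String> setOf(final String... names) {
    return new HashSet<String>(Arrays.asList(names));
  }

  // Assumed distance measure: Jaccard distance, 1 - |A and B| / |A or B|.
  static double jaccardDistance(final Set<String> a, final Set<String> b) {
    if (a.isEmpty() && b.isEmpty()) {
      return 0.0;
    }
    final Set<String> intersection = new HashSet<String>(a);
    intersection.retainAll(b);
    final Set<String> union = new HashSet<String>(a);
    union.addAll(b);
    return 1.0 - (double) intersection.size() / union.size();
  }

  // Group-average linkage: mean of all pairwise distances between two clusters.
  static double groupAverage(final List<Integer> c1, final List<Integer> c2, final double[][] d) {
    double sum = 0.0;
    for (final int i : c1) {
      for (final int j : c2) {
        sum += d[i][j];
      }
    }
    return sum / (c1.size() * c2.size());
  }

  // Naive HAC loop: repeatedly merge the two closest clusters and record each merge.
  static List<String> cluster(final List<Set<String>> trees) {
    final int n = trees.size();
    final double[][] d = new double[n][n];
    for (int i = 0; i < n; i++) {
      for (int j = 0; j < n; j++) {
        d[i][j] = jaccardDistance(trees.get(i), trees.get(j));
      }
    }
    // Start with one singleton cluster per tree, identified by its index.
    final List<List<Integer>> clusters = new ArrayList<List<Integer>>();
    for (int i = 0; i < n; i++) {
      clusters.add(new ArrayList<Integer>(Collections.singletonList(i)));
    }
    final List<String> merges = new ArrayList<String>();
    while (clusters.size() > 1) {
      int bestA = 0;
      int bestB = 1;
      double best = Double.POSITIVE_INFINITY;
      for (int a = 0; a < clusters.size(); a++) {
        for (int b = a + 1; b < clusters.size(); b++) {
          final double dist = groupAverage(clusters.get(a), clusters.get(b), d);
          if (dist < best) {
            best = dist;
            bestA = a;
            bestB = b;
          }
        }
      }
      merges.add(clusters.get(bestA) + " + " + clusters.get(bestB));
      final List<Integer> merged = new ArrayList<Integer>(clusters.get(bestA));
      merged.addAll(clusters.get(bestB));
      clusters.remove(bestB); // bestB > bestA, so index bestA stays valid
      clusters.set(bestA, merged);
    }
    return merges;
  }

  public static void main(final String[] args) {
    final List<Set<String>> trees = Arrays.asList(
        setOf("main", "a", "b"), setOf("main", "a", "c"), setOf("main", "x", "y"));
    for (final String merge : cluster(trees)) {
      System.out.println(merge);
    }
  }
}
```

In this sketch the first two trees share two of four distinct node names, so they are merged first, and the third tree joins last. The sequence of merges is what a dendrogram records. In Trevis, the two roles sketched here are played by the ContextTreeDissimilarityMeasure and the chosen agglomeration method.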

Next, create a DendrogramBuilder (a ClusteringBuilder that produces a Dendrogram), create a HierarchicalAgglomerativeClusterer, and ask the clusterer to cluster the trees, passing it the builder. Once clustering has finished, ask the DendrogramBuilder for the Dendrogram.

public class TreeClusteringExample {
  private Dendrogram cluster(final ObservableList<ContextTree> treeList) {
    final Experiment experiment = new ContextTreeExperiment(treeList);
    final DissimilarityMeasure dissimilarityMeasure = new ContextTreeDissimilarityMeasure(new NodeSetDistance());
    final AgglomerationMethod agglomerationMethod = new GroupAverageAgglomerationMethod();
    final DendrogramBuilder dendrogramBuilder = new DendrogramBuilder(experiment.getNumberOfObservations());
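    final HierarchicalAgglomerativeClusterer clusterer = new HierarchicalAgglomerativeClusterer(experiment, dissimilarityMeasure, agglomerationMethod);
    // Run the clustering; without this call the builder never receives any merges.
    clusterer.cluster(dendrogramBuilder);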
    final Dendrogram dendrogram = dendrogramBuilder.getDendrogram();
    return dendrogram;