Hac - FAQ

These are questions and answers about Hac, our open-source hierarchical agglomerative clustering library.

Is it possible to "cut" the dendrogram, or to get the contents of the clusters for a given cluster size?
Yes. The Dendrogram is a binary tree. Dendrogram.getRoot() returns the DendrogramNode at the root of that tree. Each node in the tree is either an ObservationNode (a leaf node; a subclass of DendrogramNode representing a singleton cluster that corresponds to a single observation) or a MergeNode (an interior node; a subclass of DendrogramNode representing a cluster made up of two sub-clusters). MergeNode.getDissimilarity() returns the dissimilarity between a MergeNode's two sub-clusters. To cut the dendrogram, traverse it from the root and stop descending as soon as a node's dissimilarity is no larger than the value you chose; each subtree you stop at is one cluster. If you want to stop at a given cluster size instead, you don't need the dissimilarity at all: just count the observations (leaf nodes) in each subtree.
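As a rough illustration of that traversal, here is a minimal, self-contained Java sketch. It does not use Hac itself: Node, Leaf and Merge are hypothetical stand-ins for DendrogramNode, ObservationNode and MergeNode, and the field names, the cutByDissimilarity and cutBySize helpers, and the example tree are all assumptions made for this example. It also assumes that dissimilarities do not increase as you descend the tree, which holds for linkages such as single, complete and average but not necessarily for every linkage.

```java
import java.util.ArrayList;
import java.util.List;

public class DendrogramCutSketch {

    /** Hypothetical stand-in for Hac's DendrogramNode. */
    interface Node {}

    /** Stand-in for ObservationNode: a leaf holding one observation index. */
    static final class Leaf implements Node {
        final int observation;
        Leaf(int observation) { this.observation = observation; }
    }

    /** Stand-in for MergeNode: two sub-clusters and the dissimilarity between them. */
    static final class Merge implements Node {
        final Node left;
        final Node right;
        final double dissimilarity;
        Merge(Node left, Node right, double dissimilarity) {
            this.left = left;
            this.right = right;
            this.dissimilarity = dissimilarity;
        }
    }

    /** Collects the observations under a node, left to right. */
    static List<Integer> leaves(Node node) {
        List<Integer> out = new ArrayList<>();
        collect(node, out);
        return out;
    }

    private static void collect(Node node, List<Integer> out) {
        if (node instanceof Leaf) {
            out.add(((Leaf) node).observation);
        } else {
            Merge m = (Merge) node;
            collect(m.left, out);
            collect(m.right, out);
        }
    }

    /** Cuts the tree: descends only through merges whose dissimilarity exceeds the threshold. */
    public static List<List<Integer>> cutByDissimilarity(Node root, double threshold) {
        List<List<Integer>> clusters = new ArrayList<>();
        cutByDissimilarity(root, threshold, clusters);
        return clusters;
    }

    private static void cutByDissimilarity(Node node, double threshold, List<List<Integer>> clusters) {
        if (node instanceof Merge && ((Merge) node).dissimilarity > threshold) {
            Merge m = (Merge) node;
            cutByDissimilarity(m.left, threshold, clusters);
            cutByDissimilarity(m.right, threshold, clusters);
        } else {
            clusters.add(leaves(node)); // stop here: this subtree is one cluster
        }
    }

    /** Splits the tree until every reported cluster has at most maxSize observations. */
    public static List<List<Integer>> cutBySize(Node root, int maxSize) {
        List<List<Integer>> clusters = new ArrayList<>();
        cutBySize(root, maxSize, clusters);
        return clusters;
    }

    private static void cutBySize(Node node, int maxSize, List<List<Integer>> clusters) {
        // Recounting leaves at every level is quadratic in the worst case; fine for a sketch.
        if (node instanceof Merge && leaves(node).size() > maxSize) {
            Merge m = (Merge) node;
            cutBySize(m.left, maxSize, clusters);
            cutBySize(m.right, maxSize, clusters);
        } else {
            clusters.add(leaves(node));
        }
    }

    /** Made-up example tree over observations 0..4. */
    static Node example() {
        Node a = new Merge(new Leaf(0), new Leaf(1), 1.0);
        Node c = new Merge(new Leaf(3), new Leaf(4), 0.5);
        Node b = new Merge(new Leaf(2), c, 2.0);
        return new Merge(a, b, 5.0);
    }

    public static void main(String[] args) {
        System.out.println(cutByDissimilarity(example(), 1.5)); // [[0, 1], [2], [3, 4]]
        System.out.println(cutBySize(example(), 3));            // [[0, 1], [2, 3, 4]]
    }
}
```

With the real library you would start from Dendrogram.getRoot() and read the dissimilarity through MergeNode.getDissimilarity(); how you reach a node's children and its observation is not covered in this FAQ, so check the library's Javadoc for those accessors.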