Hac is a simple library for hierarchical agglomerative clustering. Its goal is to be easy to use in any context that calls for a hierarchical agglomerative clustering approach. To use Hac, bundle it with your application and implement two interfaces: Experiment (to tell Hac what to cluster) and DissimilarityMeasure (to tell Hac how to compute the dissimilarity between two observations).
We use Hac in Trevis, our context tree visualization and analysis framework, to cluster calling context trees. We also use Hac to cluster solutions to classroom problems in Informa, our classroom clicker system.
You can use Hac as follows:
Experiment experiment = pickExperiment();
DissimilarityMeasure dissimilarityMeasure = pickDissimilarityMeasure();
AgglomerationMethod agglomerationMethod = pickAgglomerationMethod();
DendrogramBuilder dendrogramBuilder = new DendrogramBuilder(experiment.getNumberOfObservations());
HierarchicalAgglomerativeClusterer clusterer = new HierarchicalAgglomerativeClusterer(experiment, dissimilarityMeasure, agglomerationMethod);
clusterer.cluster(dendrogramBuilder);
Dendrogram dendrogram = dendrogramBuilder.getDendrogram();
You provide an implementation of the Experiment interface. An experiment consists of multiple observations. The goal of Hac is to cluster those observations.
public interface Experiment {
    public int getNumberOfObservations();
}
You also provide a corresponding implementation of DissimilarityMeasure. The dissimilarity measure computes the dissimilarity between two observations in an experiment.
public interface DissimilarityMeasure {
    public double computeDissimilarity(Experiment experiment, int observation1, int observation2);
}
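As a minimal sketch of how the two interfaces fit together, assume the observations are points in the plane. The class names PointExperiment and EuclideanDissimilarity, and the getPoint helper, are hypothetical and not part of Hac. The two interfaces are restated here only so that the example compiles on its own; in an application they would come from the Hac library.

```java
// Interfaces restated from the text above so the sketch is self-contained;
// in a real application they are supplied by the Hac library.
interface Experiment {
    int getNumberOfObservations();
}

interface DissimilarityMeasure {
    double computeDissimilarity(Experiment experiment, int observation1, int observation2);
}

// Hypothetical experiment: each observation is a point (x, y).
final class PointExperiment implements Experiment {
    private final double[][] points;

    PointExperiment(double[][] points) {
        this.points = points;
    }

    @Override
    public int getNumberOfObservations() {
        return points.length;
    }

    // Hypothetical accessor used by the matching dissimilarity measure.
    double[] getPoint(int observation) {
        return points[observation];
    }
}

// Hypothetical measure: Euclidean distance between two points.
final class EuclideanDissimilarity implements DissimilarityMeasure {
    @Override
    public double computeDissimilarity(Experiment experiment, int observation1, int observation2) {
        // The measure only understands the experiment type it was written for.
        PointExperiment pointExperiment = (PointExperiment) experiment;
        double[] a = pointExperiment.getPoint(observation1);
        double[] b = pointExperiment.getPoint(observation2);
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }
}
```

Instances of these two classes could then take the place of the experiment and dissimilarityMeasure variables in the usage snippet above. The observations are addressed by index, so the experiment decides how an index maps to actual data.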
Hac provides several implementations of the AgglomerationMethod interface.
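An agglomeration method determines how the dissimilarity between a freshly merged cluster and every remaining cluster is derived. The sketch below is not Hac's implementation or API. It is a naive, self-contained illustration, with the hypothetical names LinkageSketch and mergeHeights, that contrasts two common choices on a precomputed dissimilarity matrix: single linkage (take the minimum) and complete linkage (take the maximum).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of Hac.
final class LinkageSketch {

    // Repeatedly merges the two closest clusters and returns the dissimilarity
    // at which each merge happened. The input must be a symmetric matrix of
    // pairwise dissimilarities; it is copied and left unmodified.
    static List<Double> mergeHeights(double[][] d, boolean completeLinkage) {
        int n = d.length;
        double[][] dist = new double[n][];
        for (int i = 0; i < n; i++) {
            dist[i] = d[i].clone();
        }
        boolean[] active = new boolean[n];
        Arrays.fill(active, true);
        List<Double> heights = new ArrayList<>();

        for (int step = 0; step < n - 1; step++) {
            // Find the closest pair of clusters that are still active.
            int bestI = -1;
            int bestJ = -1;
            for (int i = 0; i < n; i++) {
                if (!active[i]) continue;
                for (int j = i + 1; j < n; j++) {
                    if (!active[j]) continue;
                    if (bestI < 0 || dist[i][j] < dist[bestI][bestJ]) {
                        bestI = i;
                        bestJ = j;
                    }
                }
            }
            heights.add(dist[bestI][bestJ]);

            // Merge cluster bestJ into bestI and update its dissimilarity
            // to every other active cluster according to the chosen linkage.
            for (int k = 0; k < n; k++) {
                if (!active[k] || k == bestI || k == bestJ) continue;
                double merged = completeLinkage
                        ? Math.max(dist[bestI][k], dist[bestJ][k])
                        : Math.min(dist[bestI][k], dist[bestJ][k]);
                dist[bestI][k] = merged;
                dist[k][bestI] = merged;
            }
            active[bestJ] = false;
        }
        return heights;
    }
}
```

For three points on a line at positions 0, 1, and 3, both variants first merge the points at 0 and 1. They differ in how far the merged cluster is considered to be from the point at 3: single linkage uses the nearer member, complete linkage the farther one. This is why the choice of agglomeration method can change the shape of the resulting dendrogram.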
Hac is available on GitHub at http://github.com/sape/hac.
Questions? Check out our Hac FAQ.