We present a new technique for clustering these large, high-dimensional datasets. The key idea involves using a cheap, approximate distance measure to efficiently divide the data into overlapping subsets we call canopies.
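The canopy idea can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not stated in the text above: the two thresholds `loose` and `tight`, the function name `make_canopies`, the choice of the lowest-index remaining point as the next canopy center, and the use of absolute difference on 1-D values as the cheap distance.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def make_canopies(
    points: Sequence[T],
    cheap_distance: Callable[[T, T], float],
    loose: float,
    tight: float,
) -> List[List[int]]:
    """Group point indices into overlapping canopies (illustrative sketch).

    `loose` and `tight` are hypothetical thresholds with loose >= tight:
    points within `loose` of a center join its canopy, and points within
    `tight` of a center are no longer eligible to become centers.
    """
    if loose < tight:
        raise ValueError("loose threshold must be >= tight threshold")

    candidates = set(range(len(points)))  # indices still eligible as centers
    canopies: List[List[int]] = []

    while candidates:
        center = min(candidates)  # deterministic choice, an assumption
        # One pass of cheap distance computations from the center.
        dists = [cheap_distance(points[center], p) for p in points]
        # Every point within the loose threshold joins this canopy,
        # even if it already belongs to another one (hence the overlap).
        canopies.append([i for i, d in enumerate(dists) if d <= loose])
        # Points within the tight threshold stop being candidate centers.
        candidates -= {i for i, d in enumerate(dists) if d <= tight}
        candidates.discard(center)  # guarantees termination

    return canopies


# Toy data: 1-D values with absolute difference as the cheap measure.
values = [0.0, 1.0, 3.0, 10.0, 11.0]
canopies = make_canopies(values, lambda a, b: abs(a - b), loose=2.5, tight=1.5)
```

In this toy run the point at index 1 falls into two canopies, which is the overlap the text describes. A more expensive, exact distance measure would then only need to be computed between points that share a canopy.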