where d(u) is the depth of node u in the tree (the root is at
depth 0), and lca(u, v) denotes the lowest common ancestor of
the nodes u and v. We then define the cost that a clustering
incurs on a pair of proteins u and v to be 1 − sim(u, v) if the
two proteins are assigned to the same cluster, and sim(u, v) if
they are assigned to different clusters.
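
As an illustration of these definitions, the following Python sketch computes d(u) and lca(u, v) on a tree stored as a child-to-parent map, and evaluates the pairwise cost of a clustering. Several things here are assumptions rather than part of the text: the parent-map representation, the function names, the caller-supplied sim function (the similarity formula itself is not restated), and the choice to sum the pairwise costs over all unordered pairs to obtain a single total.

```python
from itertools import combinations


def depth(parent, u):
    """Depth d(u) of node u; the root (whose parent is None) has depth 0."""
    d = 0
    while parent[u] is not None:
        u = parent[u]
        d += 1
    return d


def lca(parent, u, v):
    """Lowest common ancestor of u and v in a tree given as a parent map."""
    ancestors = set()
    while u is not None:          # collect u and all of its ancestors
        ancestors.add(u)
        u = parent[u]
    while v not in ancestors:     # climb from v until the two paths meet
        v = parent[v]
    return v


def pair_cost(sim_uv, same_cluster):
    """1 - sim(u, v) if u and v share a cluster, sim(u, v) otherwise."""
    return 1.0 - sim_uv if same_cluster else sim_uv


def clustering_cost(sim, labels):
    """Sum of pairwise costs over all unordered pairs (an assumed aggregate).

    sim    -- hypothetical callable returning sim(u, v) in [0, 1]
    labels -- dict mapping each protein to its cluster id
    """
    return sum(
        pair_cost(sim(u, v), labels[u] == labels[v])
        for u, v in combinations(sorted(labels), 2)
    )
```

For example, with three proteins a, b, c, similarities sim(a, b) = 0.75 and sim(a, c) = sim(b, c) = 0.25, and the clustering {a, b}, {c}, each of the three pairs contributes 0.25, so the summed cost under this sketch is 0.75.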