Formally, given a keyword query K={k1, k2, ..., kn}
and an SI-Tree, the structural similarity between SI-Tree
and K mainly captures the compactness of SI-Tree.
The more compact an SI-Tree is, the more likely it is to
be related to K. Compactness depends on three factors: (1) the
compactness-based distance, i.e., the distance between
any node and the root of SI-Tree; (2) the phrase-based
distance, i.e., the distance between any two keywords;
and (3) the size of SI-Tree, i.e., the number of nodes
involved in SI-Tree. The compactness-based distance
reflects how far the nodes lie from the root: a larger
distance means that SI-Tree is less compact and
therefore less likely to be related to K. The phrase-based
distance captures the relationship between two keywords
and gives more weight to keywords that are closer together,
especially keywords in the same node, which represent
a phrase. The size of SI-Tree is used to normalize the
overall structural similarity, since an SI-Tree with more
nodes is more likely to contain more keywords. Then,