The selection attribute set P is simply parsed from the workload.
Let D = {d1, ..., dn} be the set of dimension tables. The
workload consists of a set of queries W = {q1, ..., qm}.
Let αqi = {ae | ae ∈ qi} be the set of attributes referenced by
a query qi and βdj = {bf | bf ∈ dj} the set of attributes of a
dimension dj. For simplicity, consider a query that
involves two dimensions d1 and d2. Our strategy consists of
co-locating the blocks d1j and d2k on the same or closest
nodes, while the remaining dimension blocks are placed by
Hadoop’s default strategy over the remaining nodes. To this
end, as a first step, the parsed dimensions are encoded in a
query-dimension matrix QM whose general term QMij equals
1 if there exists an attribute bf ∈ βdj that also belongs to
αqi, and 0 otherwise.
For example, the QM matrix corresponding to W and D is
featured in Table 1.
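
As an illustration, the construction of QM can be sketched in Python as follows. This is a minimal sketch under assumed inputs: the function name build_query_dimension_matrix, the attribute names, and the toy workload are hypothetical and are not the contents of Table 1. The sets αqi and βdj are represented as plain sets of attribute names, and QMij is set to 1 whenever the two sets intersect.

```python
def build_query_dimension_matrix(query_attrs, dimension_attrs):
    """Build QM, where QM[i][j] = 1 if query i uses an attribute of dimension j.

    query_attrs:     dict mapping a query name to its attribute set (alpha_qi)
    dimension_attrs: dict mapping a dimension name to its attribute set (beta_dj)
    """
    queries = list(query_attrs)
    dimensions = list(dimension_attrs)
    qm = [
        # A non-empty intersection means some b_f in beta_dj is also in alpha_qi.
        [1 if set(query_attrs[q]) & set(dimension_attrs[d]) else 0
         for d in dimensions]
        for q in queries
    ]
    return queries, dimensions, qm


# Hypothetical toy schema and workload (illustrative names only).
dimension_attrs = {
    "d1": {"c_city", "c_region"},
    "d2": {"p_brand", "p_category"},
    "d3": {"d_year", "d_month"},
}
query_attrs = {
    "q1": {"c_city", "p_brand"},
    "q2": {"d_year"},
    "q3": {"c_region", "d_month", "lo_revenue"},
}

queries, dimensions, qm = build_query_dimension_matrix(query_attrs, dimension_attrs)

# Dimensions involved in each query, read off the rows of QM.
involved = {
    q: [d for d, bit in zip(dimensions, row) if bit]
    for q, row in zip(queries, qm)
}

for q, row in zip(queries, qm):
    print(q, row, involved[q])
```

In this toy workload, the row of q1 marks d1 and d2, so their blocks would be the candidates for co-location under the strategy described above, while q2 touches a single dimension and gives no co-location constraint.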