The growing volume of relational data calls for alternative
ways to manage it. The Hadoop framework - an open
source project based on the MapReduce paradigm - is a pop-
ular choice for big data analytics. However, the performance
gained from Hadoop’s features is currently limited by its de-
fault block placement policy, which does not take any data
characteristics into account. Indeed, the efficiency of many
operations, including indexing, grouping, aggregation and
joins, can be improved by careful data placement. In this
paper we propose a data warehouse distribution strategy
to improve query performance on multi-node clusters,
especially Hadoop clusters. Using the k-means clustering
method, which lets us control the number of clusters
through its k parameter, we investigate the performance gain
for OLAP cube construction with and without data organization,
while varying the number of clusters and the data warehouse
size. Our experiments suggest that good
data placement on a cluster during the implementation of
the data warehouse significantly improves OLAP cube
construction and querying performance.
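
As an illustration only, the idea of grouping related warehouse rows with
k-means before placing them can be sketched as follows. This is not the
implementation evaluated in the paper: the function names (kmeans,
place_rows), the toy feature vectors, and the deterministic initialisation
from the first k points are all assumptions made for the example, and the
step that maps each group onto Hadoop blocks or nodes is left out because
the abstract does not specify it.

```python
from collections import defaultdict
from math import dist


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    # Assumption: deterministic initialisation from the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        members = defaultdict(list)
        for p, label in zip(points, labels):
            members[label].append(p)
        new_centroids = [
            tuple(sum(axis) / len(members[c]) for axis in zip(*members[c]))
            if members[c] else centroids[c]  # keep an empty cluster's centroid
            for c in range(k)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels


def place_rows(row_ids, features, k):
    """Group row ids by cluster label; each group would be stored together."""
    labels = kmeans(features, k)
    groups = defaultdict(list)
    for row_id, label in zip(row_ids, labels):
        groups[label].append(row_id)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical fact rows described by two numeric features each.
    row_ids = ["r1", "r2", "r3", "r4", "r5", "r6"]
    features = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8),
                (8.8, 9.1), (1.1, 1.1), (9.2, 8.9)]
    print(place_rows(row_ids, features, k=2))
```

Here k plays the role described above: it fixes how many groups of rows
are produced, so it can be varied alongside the warehouse size when
comparing organized and unorganized data.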