2.1 Dataset description
The dataset was adapted from the Berlin
SPARQL Benchmark (BSBM) [1]. The BSBM consists
of a dataset generator and query mixes that can be
used for comparing the performance of RDF storage
and querying engines. The benchmark was built
around an e-commerce use case in which a set of
products is offered by different vendors and consumers
post reviews of those products. The benchmark dataset consists of
the following classes: product, product type, product
feature, producer, vendor, review, and person.
The BSBM was chosen because it simulates
real-world enterprise application scenarios.
In addition, the BSBM dataset is provided in the RDF
data format, which matches the Semantic Web data
setting. In our tests, datasets of different sizes
were generated by varying the number of
products: 1360, 2785, and 5544 products.
The corresponding numbers of generated triples were
500K, 1M, and 2M, respectively.
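
The relation between product count and dataset size can be checked with a short sketch. It uses only the figures reported above, treating the triple counts as nominal (rounded) values, so the ratios are approximate. The `generator_command` helper is hypothetical. It assumes the BSBM distribution's `generate` script with its `-pc` (product count), `-s` (serialization) and `-fn` (output file name) options, and it is not necessarily the invocation used to build the adapted dataset.

```python
from typing import Dict, List

# Product counts and nominal triple counts as reported in the text.
DATASETS: Dict[int, int] = {1360: 500_000, 2785: 1_000_000, 5544: 2_000_000}


def triples_per_product(products: int, triples: int) -> float:
    """Average number of generated triples per product."""
    return triples / products


def generator_command(products: int) -> List[str]:
    """Build a BSBM generator command line.

    Assumption: the flags -pc, -s and -fn and the output name
    'bsbm_<products>' are illustrative, not taken from the paper.
    """
    return ["./generate", "-pc", str(products), "-s", "nt",
            "-fn", f"bsbm_{products}"]


if __name__ == "__main__":
    for products, triples in DATASETS.items():
        ratio = triples_per_product(products, triples)
        print(f"{products} products: {triples} triples "
              f"(about {ratio:.0f} per product)")
        print("  " + " ".join(generator_command(products)))
```

With the reported figures, each product accounts for roughly 359 to 368 triples, so the dataset size grows almost linearly with the number of products.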