and we must be able to reason over it. One of the most relevant
works tackling this problem is [21], which led to a tool termed
WebPIE (Web-scale Parallel Inference Engine). In [21], the inference
rules are rewritten as MapReduce jobs: a map and a reduce function are
specified for each rule. This work has inspired that
of [22], who propose a MapReduce-based algorithm for classifying
EL+ ontologies.
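To make this rule rewriting concrete, the following is a minimal in-memory sketch (not WebPIE's actual implementation; all names are ours) that casts one RDFS rule, rdfs9 (propagating class membership along rdfs:subClassOf), as a map/reduce pair:

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def map_rdfs9(triple):
    """Key each matching triple on the class that joins the two rule patterns."""
    s, p, o = triple
    if p == RDF_TYPE:
        yield o, ("instance", s)      # key = class C, value = its instance s
    elif p == SUBCLASS:
        yield s, ("superclass", o)    # key = class C, value = its superclass D

def reduce_rdfs9(key, values):
    """Join instances and superclasses that share the same class key."""
    instances = [v for tag, v in values if tag == "instance"]
    supers = [v for tag, v in values if tag == "superclass"]
    for s in instances:
        for d in supers:
            yield (s, RDF_TYPE, d)    # the derived triple (s rdf:type D)

def run_job(triples):
    """Tiny in-memory driver standing in for the Hadoop shuffle phase."""
    groups = defaultdict(list)
    for t in triples:
        for k, v in map_rdfs9(t):
            groups[k].append(v)
    return [t for k, vs in groups.items() for t in reduce_rdfs9(k, vs)]

triples = [("ex:alice", RDF_TYPE, "ex:Student"),
           ("ex:Student", SUBCLASS, "ex:Person")]
print(run_job(triples))   # [('ex:alice', 'rdf:type', 'ex:Person')]
```

In an actual Hadoop job, the shuffle phase performs the grouping that `run_job` simulates here, and the derived triples are fed back as input until a fixpoint is reached.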
Another relevant work on this challenge focuses on efficiently
partitioning RDF repositories and on the scalability of SPARQL
queries [85]. We can also add [86], which proposes a way to store
and retrieve large RDF graphs efficiently.
Concerning the (complete) description of entities among
billions of RDF/RDFS triples, mentioned in the third challenge,
[38] designed a Semantic Web Search Engine (SWSE)
whose many features include entity description. Here,
this description is obtained by efficiently aggregating descriptions
from many sources.
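As a toy illustration of such aggregation (our sketch, not SWSE's actual pipeline), the snippet below unions the triples about one entity across several sources, with set semantics providing deduplication:

```python
# Toy sketch of aggregating an entity's description from several RDF
# sources, loosely in the spirit of SWSE [38]; all names are ours.
# Sources are lists of (subject, predicate, object) triples.
def describe(entity, sources):
    """Union all triples whose subject is `entity`, deduplicating via a set."""
    description = set()
    for triples in sources:
        description |= {t for t in triples if t[0] == entity}
    return description

src_a = [("ex:alice", "foaf:name", '"Alice"'),
         ("ex:bob", "foaf:name", '"Bob"')]
src_b = [("ex:alice", "foaf:knows", "ex:bob"),
         ("ex:alice", "foaf:name", '"Alice"')]   # duplicate, merged away
print(describe("ex:alice", [src_a, src_b]))
```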
Even if we know how to infer over billions of RDF triples, it is not
easy to deal with the noise, inconsistency and various errors
found in RDF datasets. [87] identify four sources of errors:
(i) accessibility and dereferenceability of URIs, (ii) syntax errors,
(iii) noise and inconsistency (e.g., use of undefined classes or
properties, misuse of a class as a property and vice versa, etc.)
and (iv) ontology hijacking. [88] propose either to repair the
ontology or to reason in such a noisy context. For repairing, they
identify the “minimal inconsistent subset” (MIS) of the ontology and
the subsets the MIS will affect.
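As a toy illustration of MIS extraction (our sketch, not the procedure of [88]), the snippet below applies a classical deletion-based filter: each axiom is dropped in turn, and the drop is kept only when the remaining set stays inconsistent. Axioms are propositional literals, and the consistency test stands in for a call to a real reasoner:

```python
# Toy MIS extraction by deletion filter; all names are ours.
# Axioms are literals ("p" / "-p"); a set is inconsistent if it
# contains some literal together with its negation.
def negate(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def is_consistent(axioms):
    return not any(negate(a) in axioms for a in axioms)

def one_mis(axioms):
    """Shrink an inconsistent set to a minimal inconsistent subset."""
    mis = list(axioms)
    for ax in axioms:
        rest = [a for a in mis if a != ax]
        if not is_consistent(rest):   # still inconsistent without `ax`,
            mis = rest                # so `ax` is not needed in the MIS
    return mis

print(one_mis(["p", "-p", "q", "r", "-r"]))   # ['r', '-r']: one minimal conflict
```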
For reasoning, [88] leverage the pioneering work of [89] and propose
to answer queries based on consistent subsets of the given ontology,
which grow monotonically by inclusion. The choice of these subsets is
based on syntactic and semantic heuristics. In the same paper,
uncertainty in reasoning is handled by attaching confidence values to
the elements of the ontology.
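Reusing the toy propositional encoding above, the snippet below illustrates (as our reading of this strategy, not the algorithm of [88, 89]) how a consistent subset can be grown in decreasing order of confidence and then queried classically; the membership test stands in for an entailment check by a real reasoner:

```python
# Toy query answering over a growing consistent subset; all names are
# ours. Axioms are (literal, confidence) pairs.
def negate(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def is_consistent(literals):
    return not any(negate(l) in literals for l in literals)

def consistent_subset(axioms):
    """Grow a consistent subset, most confident axioms first."""
    subset = set()
    for lit, _conf in sorted(axioms, key=lambda a: a[1], reverse=True):
        if is_consistent(subset | {lit}):   # skip axioms that break consistency
            subset.add(lit)
    return subset

def answer(axioms, query):
    """Membership in the subset stands in for classical entailment."""
    return query in consistent_subset(axioms)

axioms = [("penguin_flies", 0.2), ("-penguin_flies", 0.9),
          ("penguin_is_bird", 0.95)]
print(answer(axioms, "penguin_is_bird"))   # True
print(answer(axioms, "penguin_flies"))     # False: low-confidence axiom dropped
```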