III. RELATED RESEARCH
The three case studies described here are the closest matches to my research. The purpose of reviewing them is to investigate how semantic technologies have been exploited and which techniques have been used to enhance the functionality of digital libraries.
The study by Castells et al. [7] proposed a model for exploiting ontology-based knowledge bases to improve search over large document repositories. For this purpose, it introduces an ontology-based scheme for the semi-automatic annotation of documents, together with a retrieval system. The retrieval model is an adaptation of the classic vector space model that includes an annotation weighting scheme and a ranking algorithm. The knowledge base (KB) is built from three main base classes: Domain Concept, Topic and Document. These predefined base ontology classes are complemented with an annotation ontology that provides the basis for the semantic indexing of documents with non-embedded annotations. Documents are annotated with concept instances from the KB by creating instances of the annotation class provided for this purpose. The retrieval and ranking module then uses these annotations.
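To make the idea of ranking over weighted annotations concrete, the following is a minimal sketch and not the implementation from [7]. It assumes a TF-IDF-style annotation weight and cosine similarity between a query vector and each document vector; the text above does not specify either choice. The function names and the sample concepts are hypothetical.

```python
import math
from collections import Counter


def annotation_weights(docs):
    """Build one weighted concept vector per document.

    docs maps a document ID to the list of concept IDs it is annotated
    with (repeated IDs count as repeated occurrences).
    Assumption: a TF-IDF-style weight, (count / max count in document)
    times log(number of documents / documents containing the concept).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for concepts in docs.values():
        doc_freq.update(set(concepts))

    weights = {}
    for doc_id, concepts in docs.items():
        counts = Counter(concepts)
        max_count = max(counts.values()) if counts else 1
        weights[doc_id] = {
            concept: (count / max_count) * math.log(n_docs / doc_freq[concept])
            for concept, count in counts.items()
        }
    return weights


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(concept, 0.0) for concept, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, weights):
    """Return (document ID, score) pairs sorted by descending similarity."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in weights.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical annotated collection and a single-concept query.
docs = {
    "d1": ["Jazz", "Jazz", "Guitar"],
    "d2": ["Guitar", "Piano"],
    "d3": ["Opera"],
}
ranking = rank({"Jazz": 1.0}, annotation_weights(docs))
```

In this sketch a concept that annotates every document receives weight zero, which mirrors the usual behaviour of inverse document frequency in keyword-based retrieval.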
The study by Mustafa and Khan [8] proposed a semantic information retrieval framework to improve the precision of search results. A thematic similarity approach is employed in order to capture the context of particular concepts. User queries are answered by matching RDF triples against the existing metadata rather than by matching keywords. The experiments performed on the framework showed improvements in precision and recall compared to existing semantic-based information retrieval techniques. The proposed Semantic Information Retrieval Framework has five components: Crawler, Source Model, Semantic Matcher, Query Reformulator and Ranker. The Crawler extracts metadata in the form of RDF triples from the documents residing in the document repository and loads them into the Source Model. It continues to refresh this information so that the Source Model stays up to date. The Source Model maintains metadata about the digital documents. The Semantic Matcher performs RDF triple matching; within it, different rule bases are created to draw inferences from the existing RDF data, where a rule is an object that can be applied to deduce inferences from RDF data. The Query Reformulator expands the RDF query with synonyms, semantic neighborhood and other relationships such as hyponymy (i.e. Is-A) and meronymy (i.e. Part-of), using a distance-based approach. The query is then rewritten for these expanded terms and passed to the Semantic Matcher in the form of RDF triples. The Ranker sorts the documents according to their relevance to the user's query.
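The interplay of the Query Reformulator, Semantic Matcher and Ranker can be illustrated with a small sketch. It is not the framework of [8]: the triples, the term graph, the breadth-first expansion and the 1/(1 + distance) score are all assumptions chosen for illustration, and plain tuples stand in for a real RDF store.

```python
from collections import deque

# Hypothetical source model: (document, predicate, value) triples.
SOURCE_MODEL = {
    ("doc1", "hasTopic", "car"),
    ("doc2", "hasTopic", "automobile"),
    ("doc3", "hasTopic", "sedan"),
    ("doc4", "hasTopic", "wheel"),
    ("doc5", "hasTopic", "boat"),
}

# Hypothetical term graph: term -> [(related term, relation)].
RELATED_TERMS = {
    "car": [("automobile", "synonym"), ("sedan", "hyponym"), ("wheel", "meronym")],
    "sedan": [("taxi", "hyponym")],
}


def expand(term, max_distance=1):
    """Query reformulation: breadth-first search over the term graph.

    Returns {term: distance} for every term within max_distance steps.
    """
    distances = {term: 0}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if distances[current] == max_distance:
            continue
        for neighbour, _relation in RELATED_TERMS.get(current, []):
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    return distances


def match(pattern, triples):
    """Semantic matching: triples that agree with the pattern (None = wildcard)."""
    return [
        triple for triple in triples
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]


def search(predicate, term, max_distance=1):
    """Expand the query term, match triples, and rank documents.

    Assumption: a document's score is 1 / (1 + distance) of the closest
    expanded term it matches; ties are broken by document ID.
    """
    scores = {}
    for expanded, distance in expand(term, max_distance).items():
        for doc, _pred, _value in match((None, predicate, expanded), SOURCE_MODEL):
            scores[doc] = max(scores.get(doc, 0.0), 1.0 / (1 + distance))
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


results = search("hasTopic", "car")
```

With max_distance set to 0 the search degenerates to exact triple matching, which makes the contribution of query expansion easy to compare against a baseline.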
The research by Frosterus and Hyvönen [9] proposed a hybrid document search approach that combines the benefits of traditional text search over literal documents with semantic search based on their underlying conceptual structures. The approach is based on document expansion, where documents are automatically annotated not only with the concepts explicitly present in a given document, but also with ontologically related concepts, which receive smaller weights. To facilitate semantic search, each lemmatized term in a document is matched against ontological concepts using the labels present in an ontology. If a match is found, the concept's URI is added to the document's metadata as a subject annotation. This case study thus describes a hybrid search architecture and process that rely on text search and on semantic search using ontology-based document expansion.
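Document expansion as described above can be sketched in a few lines. This is an illustration under stated assumptions, not the system of [9]: the label index, the related-concept map, the example.org URIs and the weight of 0.5 for related concepts are all hypothetical, and the input is assumed to be lemmatized already.

```python
# Hypothetical label index: lemma -> concept URI.
LABELS = {
    "dog": "http://example.org/onto#Dog",
    "cat": "http://example.org/onto#Cat",
}

# Hypothetical ontology relations: concept URI -> related concept URIs.
RELATED_CONCEPTS = {
    "http://example.org/onto#Dog": ["http://example.org/onto#Mammal"],
    "http://example.org/onto#Cat": ["http://example.org/onto#Mammal"],
}


def annotate(lemmas, related_weight=0.5):
    """Return {concept URI: weight} subject annotations for one document.

    Concepts whose label matches a lemma get weight 1.0; concepts that are
    only ontologically related get the smaller related_weight.
    """
    explicit = {LABELS[lemma] for lemma in lemmas if lemma in LABELS}
    annotations = {uri: 1.0 for uri in explicit}
    for uri in explicit:
        for neighbour in RELATED_CONCEPTS.get(uri, []):
            # setdefault keeps weight 1.0 if the concept is also explicit.
            annotations.setdefault(neighbour, related_weight)
    return annotations


subject_annotations = annotate(["the", "dog", "bark"])
```

A query for the related concept can then retrieve the document even though the matching term never occurs in its text, while the lower weight keeps such documents ranked below those that mention the concept explicitly.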