Usage Statistics and Top 20 Most Accessed Documents
In-house usage analysis programs were developed to supplement the usage reports that come with DSpace. They are based on the web access logs captured by the server when users issue requests to download the bitstreams, i.e. the PDF files. The Repository is open to the world and allows visits from search engines, robots and OAI harvesters. As a result, it receives tens of thousands of web requests per day. A program was developed to enable the Library to know how many times the IR documents were downloaded by "real users", excluding robot accesses. This figure was posted on the Repository home page and updated monthly.
Another customized program produced the monthly listings of the Top 20 most accessed documents. It is interesting to analyze these Top 20 lists, as they give a good account of the documents, topics and authors that users are most interested in. Such information is useful for IR promotion. For example, the Library wrote to the authors in the lists to inform them about the high usage of their papers. The IR Team also showed the lists to faculty members during departmental visits. While the majority of the documents are from the academic departments, it is worth mentioning that a number of documents authored by the HKUST Language Center made their way into the lists, together with the ones the HKUST Library wrote on institutional repositories and virtual reference.
CJK Search and Display
In the early versions of DSpace, there were problems with searching and displaying Chinese characters. The authors managed to fix these problems by revising and replacing some of the DSpace source code. While some of these problems were eventually fixed in later versions of DSpace, the timing of the fixes was critical to the Library's IR software selection. Had they not been fixed during software evaluation, the Library would not have selected DSpace. Thanks to open source, one could dig into the source code and fix problems quickly.
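As a rough illustration of how search can fail on Chinese text, and not a reconstruction of the actual DSpace code or of the authors' fix, the Python sketch below compares a tokenizer that splits only on whitespace with one that emits each CJK ideograph as its own token. The function names, the sample title and the simple "all query tokens must be present" matching rule are assumptions made for this example only.

```python
def whitespace_tokens(text):
    """Split on whitespace only, as is adequate for space-delimited scripts."""
    return text.split()


def is_cjk(ch):
    """True for characters in the CJK Unified Ideographs block (U+4E00..U+9FFF)."""
    return "\u4e00" <= ch <= "\u9fff"


def cjk_aware_tokens(text):
    """Emit each CJK ideograph as its own token; keep other words whole."""
    tokens = []
    word = []
    for ch in text:
        if is_cjk(ch):
            if word:                      # flush any pending non-CJK word
                tokens.append("".join(word))
                word = []
            tokens.append(ch)
        elif ch.isalnum():
            word.append(ch.lower())
        elif word:                        # whitespace or punctuation ends a word
            tokens.append("".join(word))
            word = []
    if word:
        tokens.append("".join(word))
    return tokens


def matches(tokenizer, document, query):
    """Hypothetical matching rule: every query token must occur in the document."""
    doc_tokens = set(tokenizer(document))
    return all(tok in doc_tokens for tok in tokenizer(query))


# Sample data for illustration only.
title = "香港科技大學圖書館 Institutional Repository"
query = "圖書館"

print(matches(whitespace_tokens, title, query))  # False: the whole Chinese run is one token
print(matches(cjk_aware_tokens, title, query))   # True: each ideograph is indexed separately
```

Because Chinese is written without spaces between words, the whitespace tokenizer indexes the entire Chinese phrase as a single token, so a query for a word inside it finds nothing. Per-character (or overlapping two-character) tokens are one common workaround, at the cost of some false matches.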
The main CJK problem was attributed to the use of a CJK-illegible string tokenizer. DSpace is Unicode capable, meaning that it supports data and strings in multiple scripts, including CJK. However, as with many other non-Roman scripts, the way Chinese strings are sorted, indexed and searched can be quite different from that for English. Global software developers should be aware of these differences in order to avoid problems similar to the ones encountered with DSpace.
Enhancing Global Access
It is essential to publicize an institutional repository so that the research output can be