IV. SYSTEM DEPLOYMENT AND EXPERIMENT
To facilitate system management and further development, a configuration.xml file is used to manage the system configuration information. The file contains three main tags. The first specifies the Chinese word segmentation used in the system. The second contains a sub-tag that specifies the designated website, which serves as the start page for information gathering. The third configures the index: one of its sub-tags defines the fields of the index, for which we defined five (url, title, author, content and indextime), and another specifies the directory where the index files are stored. The details of the configuration.xml file are as follows: