Document classification: For document classification we propose k-Nearest Neighbour classification (kNN), a simple instance-based method. When a test document is submitted to the system, the system finds its k nearest neighbours among the profile documents and uses the categories of those k neighbours to weight the candidate categories for the test document. According to the
evaluation by Yang and Liu, kNN (using cosine similarity on document vectors) is one of the best-performing methods for text categorization [Yang 1999]. This paper describes work that is still in progress. At present, we do not have enough qualified data to optimize the parameters of the kNN classifier. The kNN classifier is therefore left as future work and is not discussed in more detail in this
paper.
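
The kNN procedure described above can be illustrated with a minimal sketch. It is not the system's implementation, which is still future work. The sketch assumes that documents are represented as raw term-frequency vectors, that each profile document carries exactly one category, and that each category is scored by summing the cosine similarities of the neighbours that belong to it. The function names (cosine, vectorize, knn_classify), the sample profile documents, and the value k=3 are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def vectorize(text):
    """Assumption: plain term frequencies on whitespace tokens, no weighting."""
    return Counter(text.lower().split())


def knn_classify(test_vec, profiles, k=3):
    """Rank categories for test_vec.

    profiles is a list of (vector, category) pairs. The k most similar
    profile documents vote for their category, each vote weighted by its
    cosine similarity (an assumed weighting scheme).
    """
    scored = sorted(
        ((cosine(test_vec, vec), cat) for vec, cat in profiles),
        key=lambda pair: pair[0],
        reverse=True,
    )
    scores = defaultdict(float)
    for sim, cat in scored[:k]:
        scores[cat] += sim
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical profile documents, for illustration only.
profiles = [
    (vectorize("stock market shares trading"), "finance"),
    (vectorize("bank interest rates market"), "finance"),
    (vectorize("football match goal league"), "sports"),
    (vectorize("tennis match player serve"), "sports"),
]

ranked = knn_classify(vectorize("market shares and interest rates"), profiles, k=3)
print(ranked[0][0])  # prints "finance"
```

The choice of k and of the vote weighting are exactly the parameters that cannot yet be tuned with the data currently available, so the values used here carry no empirical meaning.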