Web mining is used to extract information from users’
past behavior. Web structure mining plays an important role
in this approach. Two commonly used algorithms in web
structure mining are HITS and PageRank, which are used
to rank the relevant pages. Both algorithms treat all links
equally when distributing rank scores. Several algorithms
have been developed to improve the performance of these
methods. This paper introduces the Weighted PageRank (WPR) algorithm, an
extension to the PageRank algorithm. WPR takes into account
the importance of both the inlinks and the outlinks of the
pages and distributes rank scores based on the popularity
of the pages. Simulation studies using the website of Saint
Thomas University show that WPR identifies more pages
relevant to a given query than standard PageRank does.
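To make the weighting idea concrete, the following is a minimal Python sketch of a weighted rank computation. This section does not restate the formulas, so the sketch assumes one common formulation: the rank that page v passes to page u is scaled by u's share of the inlinks, times u's share of the outlinks, among all pages that v references. The function name `weighted_pagerank`, the damping factor `d=0.85`, the fixed iteration count, and the toy graph are illustrative assumptions, not values taken from the experiments.

```python
def weighted_pagerank(links, d=0.85, iterations=50):
    """Sketch of a weighted PageRank variant (assumed formulation).

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its rank score.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    # De-duplicate outlinks while keeping their order.
    out_links = {p: list(dict.fromkeys(links.get(p, []))) for p in pages}

    # Count inlinks and outlinks for every page.
    out_count = {p: len(out_links[p]) for p in pages}
    in_count = {p: 0 for p in pages}
    for v in pages:
        for u in out_links[v]:
            in_count[u] += 1

    # Weight of link (v, u): u's share of inlinks times u's share of
    # outlinks among the pages referenced by v (v's reference page list).
    weight = {}
    for v in pages:
        refs = out_links[v]
        total_in = sum(in_count[p] for p in refs)
        total_out = sum(out_count[p] for p in refs)
        for u in refs:
            w_in = in_count[u] / total_in if total_in else 0.0
            w_out = out_count[u] / total_out if total_out else 0.0
            weight[(v, u)] = w_in * w_out

    # Iterate the rank update; unlike standard PageRank, a page's rank is
    # not split evenly among its outlinks but in proportion to the weights.
    rank = {p: 1.0 for p in pages}
    for _ in range(iterations):
        new_rank = {p: 1.0 - d for p in pages}
        for (v, u), w in weight.items():
            new_rank[u] += d * rank[v] * w
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical three-page site used only for illustration.
    toy_site = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    for page, score in sorted(weighted_pagerank(toy_site).items()):
        print(page, round(score, 4))
```

In this toy graph, page A links to both B and C, but C has more inlinks than B, so C receives the larger share of A's rank. Standard PageRank would split A's rank evenly between the two.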
In the current version of WPR, only the inlinks and outlinks
of the pages in the reference page list are used in the
calculation of the rank scores. In future work, we plan to
investigate calculating the rank scores using more than one
level of reference page list, and to analyze WPR’s performance
in detail using different websites and multiple levels of
reference page lists.
As part of our future work, we also plan to carry out an extensive
performance analysis of WPR using other websites
and a larger number of ‘human’ users to categorize
the web pages.