Methods
In this study, we propose three strategies for investigating the use
of MeSH in the automated classification of web content as either
belonging to the field of healthcare (labeled health) or falling
outside it (labeled non-health). Fig. 1 gives an overview of the
process of developing and evaluating the classifier.
For each of the three strategies, we used a collection of web pages
that had previously been labeled as either health or non-health (the
training database) to generate a vector of characteristics for every
page in the database. These vectors of characteristics serve as a
vector representation of the web pages [32]. For presentational
purposes, we named the three strategies as follows: