The performance of the three database approaches is evaluated
using real clinical notes obtained from a GP’s clinic.
The data contain 50,000 general family medicine consultation
records (about 120 MB of text). The diseases covered
include, but are not limited to, the common cold and influenza.
Data about drug intake and vaccination associated with these
records are also included where applicable. The data
are de-identified to remove protected health information. The
clinical notes are stored electronically using a clinical database
with the format shown in Fig. 7(a). A clinical note begins with
the marker [Start], followed by the symptom name (concept),
and the attributes associated with the symptom. The symptom
name and the attributes that follow are separated by
the [Separator] markers. The clinical note ends with the [End]
marker. A sample of the clinical notes represented in this format is shown in Fig. 7(b).
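
To make the record layout concrete, the following is a minimal Python sketch of how notes in this format could be split into a symptom name and its attributes. It is an illustration only, not the storage or parsing code used in this work. It assumes that the markers appear literally as the strings [Start], [Separator], and [End], that the first field is the symptom name, and that surrounding whitespace is insignificant. The function name parse_notes and the sample note are hypothetical and are not taken from Fig. 7(b).

```python
import re
from typing import Dict, List

# Marker strings as named in the text; their exact spelling in the
# stored data is an assumption made for this sketch.
START, SEP, END = "[Start]", "[Separator]", "[End]"

# Non-greedy match of everything between one [Start] and the next [End].
NOTE_RE = re.compile(re.escape(START) + r"(.*?)" + re.escape(END), re.DOTALL)


def parse_notes(text: str) -> List[Dict[str, object]]:
    """Split raw text into notes, each with a symptom and its attributes."""
    notes = []
    for body in NOTE_RE.findall(text):
        fields = [f.strip() for f in body.split(SEP)]
        fields = [f for f in fields if f]  # drop empty fields
        if not fields:
            continue  # skip a note that has no symptom name
        notes.append({"symptom": fields[0], "attributes": fields[1:]})
    return notes


# Hypothetical sample note, not taken from Fig. 7(b).
sample = "[Start] cough [Separator] dry [Separator] 3 days [End]"
print(parse_notes(sample))
```

Under these assumptions, each note becomes a small record whose first field is the concept and whose remaining fields are its attributes, which is the unit that any of the database approaches would then need to store.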