Recent years have seen many natural-language processing
(NLP) projects aimed at producing
grammars/parsers capable of assigning reasonable syntactic
structure to a broad swath of English. Naturally,
judging the output of such a parser requires a “gold
standard,” and NLP researchers have been fortunate
to have several corpora of hand-parsed sentences for
this purpose, of which the so-called “Penn tree-bank”
[7] is perhaps the best known. It is also the corpus
used in this study. (In particular, we used the Wall
Street Journal portion of the tree-bank, which consists
of about one million words of hand-parsed sentences.)
However, when a convenient standard exists, the research