The main motivation was to determine efficient ways of querying data from multiple XML
documents. The project builds on the “Standard Data Source Template (SDST)” concept proposed
in the IEEE paper “Mining Association rules from complex and irregular XML Documents
using XSLT and XQuery” by Xinwei Wang and Chunjing Cao [2]. The campus crime data is
stored in different files based on the location of the crimes and the year in which they occurred. The XML
document structure can be very complex, with several attributes, which makes it difficult to
generate consolidated results. Powerful query languages such as XQuery help in querying these
multiple documents and generating a single, simple output in XML format. The XML result
can be used efficiently as a standard input structure for the Apriori algorithm [2]
implementation, which keeps that implementation generic with respect to future changes in the data. A further
motivation for developing this tool was to understand how to build a system
capable of querying data from XML (Extensible Markup Language), which has become a
standard for data exchange over the web. Using XQuery and an SDST-based approach
to implement data mining algorithms [3] that extract useful rules and
patterns for data analysis is another motivating factor for this project.
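
The consolidation idea described above can be illustrated with a minimal sketch. It uses Python's standard xml.etree.ElementTree module as a stand-in for XQuery, so it is not the project's implementation. The two input layouts, the element and attribute names, the sample values, and the function names (records_from_a, records_from_b, consolidate) are all hypothetical, chosen only to show how differently structured documents might be mapped onto one flat, uniform output structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical source documents: one per (location, year), each with its own layout.
DOC_A = """<campus location="North" year="2010">
  <incident><category>Burglary</category><count>4</count></incident>
  <incident><category>Robbery</category><count>1</count></incident>
</campus>"""

DOC_B = """<report>
  <meta><site>South</site><period>2011</period></meta>
  <crimes><crime type="Burglary" total="2"/></crimes>
</report>"""


def records_from_a(root):
    """Yield (location, year, category, count) tuples from layout A."""
    for inc in root.iter("incident"):
        yield (root.get("location"), root.get("year"),
               inc.findtext("category"), inc.findtext("count"))


def records_from_b(root):
    """Yield (location, year, category, count) tuples from layout B."""
    site = root.findtext("meta/site")
    period = root.findtext("meta/period")
    for crime in root.iter("crime"):
        yield (site, period, crime.get("type"), crime.get("total"))


def consolidate(sources):
    """Merge several differently shaped documents into one flat <records> tree."""
    out = ET.Element("records")
    for xml_text, extractor in sources:
        root = ET.fromstring(xml_text)
        for location, year, category, count in extractor(root):
            rec = ET.SubElement(out, "record")
            ET.SubElement(rec, "location").text = location
            ET.SubElement(rec, "year").text = year
            ET.SubElement(rec, "category").text = category
            ET.SubElement(rec, "count").text = count
    return out


result = consolidate([(DOC_A, records_from_a), (DOC_B, records_from_b)])
print(ET.tostring(result, encoding="unicode"))
```

Because every record in the output has the same four child elements, a downstream mining step would only need to understand this one structure, regardless of how the source files are laid out.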