The most important lesson I learned was how to extract useful knowledge from aggregated data that is stored across multiple separate files. The campus crime data published on the U.S. Department of Education website contains thousands of records with multiple attributes. The aim of this project was to extract this data efficiently and to supply it to the Apriori algorithm in a format the algorithm can accept.
Using XQuery to extract data from XML files was a valuable learning experience: it taught me how to query XML documents to search and modify data. Presenting the data as charts required some data warehousing work, which helped me understand the pre-processing needed before constructing a data mart. It also taught me how to resolve the data into dimension tables and how to select and retrieve the facts that belong in the fact table.
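The extraction step described above can be illustrated with a small sketch. The project itself used XQuery; the example below instead uses Python's standard `xml.etree.ElementTree` module as a stand-in, and the XML layout (`campuses`, `campus`, and the crime-type element names) is a hypothetical simplification, not the actual schema of the Department of Education files. It turns each campus record into a set of items, which is the transaction form that Apriori-style algorithms typically expect.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XML; the real campus crime files are structured differently.
SAMPLE_XML = """
<campuses>
  <campus name="Campus A"><burglary>12</burglary><robbery>0</robbery><arson>1</arson></campus>
  <campus name="Campus B"><burglary>3</burglary><robbery>2</robbery><arson>0</arson></campus>
</campuses>
"""


def to_transactions(xml_text):
    """Convert each <campus> record into a set of crime types that occurred."""
    root = ET.fromstring(xml_text.strip())
    transactions = []
    for campus in root.findall("campus"):
        # An item is present in the transaction when its count is above zero.
        items = {child.tag for child in campus if int(child.text) > 0}
        transactions.append(items)
    return transactions
```

Calling `to_transactions(SAMPLE_XML)` yields one set per campus, for example a set containing `burglary` and `arson` for the first record. Treating any non-zero count as "present" is an assumption made for this sketch; a real pre-processing step might bin the counts into ranges instead.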
Adapting the Apriori algorithm to accept XML data was another programming concept I learned during the project. One important lesson from that work was how to modularize the code into a single assembly that can be reused in other projects. Implementing a three-tier layered application helped me understand how to develop enterprise applications and how to separate the layers so that they remain loosely coupled. It made clear to me that loose coupling among the data layer, business layer, and presentation layer makes the application easier for developers to maintain and to modify in the future. Because of this, the present web application layer can be replaced with a desktop application layer, turning it into a desktop application without any changes to the business layer or the data layer.
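The loose coupling described above can be sketched in a few lines. This is a minimal illustration in Python, not the project's actual code: the class and function names (`InMemoryCrimeData`, `FrequentItemsetService`, `render_text`, `render_html`) and the sample transactions are all hypothetical. The business layer holds a compact Apriori-style frequent itemset search, and the two presentation functions show that the front end can be swapped without touching the other layers.

```python
from itertools import combinations


class InMemoryCrimeData:
    """Data layer (hypothetical): supplies transactions as sets of items."""

    def load_transactions(self):
        return [
            {"burglary", "arson"},
            {"burglary", "robbery"},
            {"burglary", "robbery", "arson"},
            {"robbery"},
        ]


class FrequentItemsetService:
    """Business layer: Apriori-style search, unaware of storage or display."""

    def __init__(self, data_source, min_support):
        self.data_source = data_source
        self.min_support = min_support

    def frequent_itemsets(self):
        transactions = self.data_source.load_transactions()
        n = len(transactions)
        items = sorted({item for t in transactions for item in t})
        candidates = [frozenset([item]) for item in items]
        result = {}
        k = 1
        while candidates:
            # Count how many transactions contain each candidate itemset.
            counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
            frequent = {c: cnt / n for c, cnt in counts.items()
                        if cnt / n >= self.min_support}
            result.update(frequent)
            k += 1
            # Join step: merge frequent itemsets into candidates of size k.
            joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
            # Prune step: keep candidates whose (k-1)-subsets are all frequent.
            candidates = [c for c in joined
                          if all(frozenset(s) in frequent
                                 for s in combinations(c, k - 1))]
        return result


def _ordered(itemsets):
    """Sort itemsets by size, then alphabetically, for stable display."""
    return sorted(itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0])))


def render_text(itemsets):
    """Presentation layer, variant 1: plain text (e.g. a desktop or console view)."""
    return "\n".join(f"{', '.join(sorted(s))}: {sup:.2f}" for s, sup in _ordered(itemsets))


def render_html(itemsets):
    """Presentation layer, variant 2: an HTML list (e.g. a web view)."""
    rows = "".join(f"<li>{', '.join(sorted(s))}: {sup:.2f}</li>"
                   for s, sup in _ordered(itemsets))
    return f"<ul>{rows}</ul>"
```

A caller would build the service once, for instance `FrequentItemsetService(InMemoryCrimeData(), min_support=0.5)`, and pass the result of `frequent_itemsets()` to either renderer. Because the service depends only on an object with a `load_transactions` method, the in-memory data source could likewise be replaced by one that reads XML without changing the business layer.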