a) Frequent Pattern Mining
The most significant feature of a frequent pattern mining
algorithm such as the FP-Tree algorithm [12] is that it
compresses a large database into a compact tree structure
(FP-tree) and quickly mines the set of frequent patterns
without generating candidate itemsets, which avoids repeated
database scans. It consists mainly of the Insert_Tree
generation algorithm and the FP_Growth frequent pattern
mining algorithm. The correlation rules among the attributes
of events can be mined from the set of security alert events
by applying the FP-Tree algorithm with a minimum support
threshold min_sup.
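The two steps named above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the names build_tree, fp_growth and confidence, the alert attribute strings, and the use of an absolute count for min_sup are all assumptions made for illustration.

```python
from collections import defaultdict


class Node:
    """One FP-tree node: an item, a count, and links to parent and children."""
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}


def build_tree(transactions, min_sup):
    """Insert_Tree step. `transactions` is a list of (items, count) pairs.
    Returns a header table mapping each frequent item to its tree nodes."""
    # Scan 1: count the support of every single item.
    support = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            support[item] += count
    frequent = {i for i, s in support.items() if s >= min_sup}
    # Scan 2: insert each transaction, most frequent items first, so that
    # transactions sharing a prefix share a path in the tree.
    root = Node(None, None)
    header = defaultdict(list)
    for items, count in transactions:
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-support[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header


def fp_growth(transactions, min_sup, suffix=frozenset()):
    """FP_Growth step: recursively mine conditional pattern bases.
    Returns {frozenset(pattern): support}; no candidate sets are generated."""
    patterns = {}
    header = build_tree(transactions, min_sup)
    for item, nodes in header.items():
        pattern = suffix | {item}
        patterns[pattern] = sum(n.count for n in nodes)
        # Conditional pattern base: the prefix path above each node of `item`.
        base = []
        for n in nodes:
            path, p = [], n.parent
            while p is not None and p.item is not None:
                path.append(p.item)
                p = p.parent
            if path:
                base.append((path, n.count))
        patterns.update(fp_growth(base, min_sup, pattern))
    return patterns


def confidence(patterns, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from mined supports."""
    both = frozenset(antecedent) | frozenset(consequent)
    return patterns[both] / patterns[frozenset(antecedent)]


# Hypothetical alert events, each reduced to a set of attribute values.
alerts = [
    ["scan", "ssh", "bruteforce"],
    ["scan", "ssh"],
    ["scan", "http"],
    ["ssh", "bruteforce"],
    ["scan", "ssh", "bruteforce"],
]
patterns = fp_growth([(a, 1) for a in alerts], min_sup=3)
rule_conf = confidence(patterns, {"ssh"}, {"bruteforce"})
```

With min_sup = 3, the pair {ssh, bruteforce} is kept (support 3) while {scan, bruteforce} is dropped (support 2), which shows how the threshold decides which attribute correlations survive as rules.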
b) Sequential Pattern Mining
The WINEPI algorithm [13] is used for sequential pattern
mining to discover the sequential relationships among
security alert events. First, the set of frequent events is
extracted from the security alert events of a specific
type within a given sliding window, and the set of shorter
candidate frequent episode patterns is generated.
Second, longer frequent episode patterns are
discovered through iteration. Finally, the sequential
relationships among the episode patterns, that is, the
sequential relationships among the security alert events,
are discovered based on the frequency and confidence
thresholds.
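The three steps above can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: it handles serial episodes only, assumes integer timestamps with at most one event per time unit, checks every window by brute force, and the names window_frequency, winepi, episode_rules, min_fr and min_conf are hypothetical.

```python
def occurs_serial(window_types, episode):
    """True if the event types of `episode` appear in this order in the window."""
    it = iter(window_types)
    return all(etype in it for etype in episode)


def window_frequency(events, episode, win):
    """Fraction of sliding windows of width `win` that contain `episode`.
    `events` is a list of (time, type) pairs sorted by time."""
    t_start = events[0][0]
    t_end = events[-1][0] + 1          # the sequence covers [t_start, t_end)
    hits = total = 0
    # Every window [ws, ws + win) that overlaps the sequence is counted.
    for ws in range(t_start - win + 1, t_end):
        window = [etype for t, etype in events if ws <= t < ws + win]
        total += 1
        hits += occurs_serial(window, episode)
    return hits / total


def winepi(events, win, min_fr, max_len=3):
    """Level-wise search: frequent episodes of length k seed length k + 1."""
    types = sorted({etype for _, etype in events})
    frequent = {}
    level = [(t,) for t in types]      # step 1: single-event candidates
    for _ in range(max_len):
        current = {}
        for ep in level:
            fr = window_frequency(events, ep, win)
            if fr >= min_fr:
                current[ep] = fr
        if not current:
            break
        frequent.update(current)
        # Step 2: extend by one frequent event type; keep a candidate only
        # if its suffix is also frequent (any sub-episode must be frequent).
        singles = [ep[0] for ep in frequent if len(ep) == 1]
        level = [ep + (t,) for ep in current for t in singles
                 if ep[1:] + (t,) in current]
    return frequent


def episode_rules(frequent, min_conf):
    """Step 3: rules prefix => episode with confidence fr(episode) / fr(prefix)."""
    rules = []
    for ep, fr in frequent.items():
        if len(ep) > 1:
            conf = fr / frequent[ep[:-1]]
            if conf >= min_conf:
                rules.append((ep[:-1], ep, conf))
    return rules


# Hypothetical alert stream: (timestamp, alert type).
events = [(0, "scan"), (1, "exploit"), (4, "scan"), (5, "exploit")]
frequent = winepi(events, win=2, min_fr=0.25)
rules = episode_rules(frequent, min_conf=0.5)
```

In this toy stream the serial episode (scan, exploit) occurs in 2 of the 7 windows and scan alone in 4 of 7, so the rule "scan is followed by exploit within the window" has confidence 0.5.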
c) Pattern Analysis and Learning
Through the analysis of the frequent patterns and sequential
patterns found in the knowledge discovery process described
above, we observed that some frequent patterns are merely
statistical phenomena that are meaningless with respect to
security situation analysis. On the other hand, some
security alert events occur only rarely, so their
regularity is unclear and the confidence levels of the
generated rules are low, yet these rules are vital to the
correlation of the security situation. To utilize the discovered
knowledge effectively, the Prolog-EBG machine learning
algorithm is adopted to interpret and analyze the
discovered knowledge by introducing prior domain
knowledge, and the revised and optimized rules are exported
from the set of security alert events generated by attack
simulations. Through this optimization process, the
confidence levels of the rules are properly evaluated from the