Performance of Extracting Correlations
In Figure 11 we compare the time needed to extract correlation intervals using the same setup as in our accuracy evaluation.
In the top chart, we report average times for the proposed methods using sliding and fixed time intervals (left), and the top-k technique (right). The time needed to compute correlations using sliding time intervals is approximately one third larger than the time taken by the fixed-interval method, since the former must incrementally compute and compare correlations for two intervals: a fixed-length interval sliding in front of the cursor, and a dynamically expanding interval behind the cursor. The time of the baseline methods remains high, since they compute correlations for a set of pairs proportional to |L × L|.
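To make the incremental scheme concrete, the sketch below (in Python, with names such as RunningCorrelation and advance_cursor chosen for illustration rather than taken from our implementation) maintains running sums so that either window can be extended or shrunk in constant time per step; the sliding variant pays the extra cost of keeping and comparing two such interval states, front and back, instead of one.

```python
import math

class RunningCorrelation:
    """Running sums for the Pearson correlation of two aligned series
    over an interval; extending or shrinking the interval is O(1)."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def add(self, x, y):
        self.n += 1
        self.sx += x; self.sy += y
        self.sxx += x * x; self.syy += y * y; self.sxy += x * y

    def remove(self, x, y):
        self.n -= 1
        self.sx -= x; self.sy -= y
        self.sxx -= x * x; self.syy -= y * y; self.sxy -= x * y

    def value(self):
        if self.n < 2:
            return 0.0
        cov = self.sxy - self.sx * self.sy / self.n
        vx = self.sxx - self.sx * self.sx / self.n
        vy = self.syy - self.sy * self.sy / self.n
        return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def advance_cursor(cursor, xs, ys, w, front, back):
    """Move the cursor one step: the fixed-length window of width w in
    front of the cursor slides (one point enters, one leaves), while the
    window behind the cursor only expands."""
    back.add(xs[cursor], ys[cursor])               # point the cursor just passed
    front.remove(xs[cursor], ys[cursor])           # same point leaves the front window
    if cursor + w < len(xs):
        front.add(xs[cursor + w], ys[cursor + w])  # point entering the front window
    return cursor + 1

# Usage: initialize `front` over [cursor, cursor + w) and `back` over [0, cursor),
# then call advance_cursor repeatedly while comparing front.value() and back.value().
```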
In the bottom chart, we demonstrate the effect of hierarchical pruning for fixed-interval correlations and compare it to the grid-hashing correlation pruning used by Stat Stream [21]. We note that Stat Stream cannot be applied to sliding-interval correlations when the sliding intervals of different time series have different lengths, as in our method. Moreover, the type of time series approximation used in Stat Stream is not suitable for bounded sentiments, where it results in numerous false positives at shorter interval lengths.
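For reference, the following sketch illustrates grid-hashing pruning in the spirit of Stat Stream [21], under simplifying assumptions of our own (z-normalized windows, the first two DFT coefficients as the summary, a cell width derived from the correlation threshold, and a threshold strictly below 1); it is not the original implementation. Pairs falling into non-neighbouring cells are pruned, while the remaining candidates still require exact verification, which is why a high false-positive rate on short, bounded-sentiment windows erodes the benefit of this pruning.

```python
import numpy as np
from collections import defaultdict
from itertools import product

def grid_candidates(windows, threshold, n_coeffs=2):
    """Grid-hashing pruning sketch: hash a low-dimensional DFT summary of
    each z-normalized window into cells of width eps; only pairs in the
    same or neighbouring cells are kept as candidates for exact
    verification. `windows` is a list of equal-length 1-D arrays."""
    w = len(windows[0])
    # For z-normalized series, squared Euclidean distance and correlation
    # are related by d^2 = 2 * w * (1 - corr); assumes threshold < 1.
    eps = np.sqrt(2.0 * w * (1.0 - threshold))
    cells = defaultdict(list)
    for sid, s in enumerate(windows):
        z = np.asarray(s, dtype=float)
        z = (z - z.mean()) / (z.std() + 1e-12)          # z-normalize the window
        f = np.fft.rfft(z)[:n_coeffs] / np.sqrt(w)      # low-frequency DFT summary
        key = tuple(np.floor(np.r_[f.real, f.imag] / eps).astype(int))
        cells[key].append(sid)
    candidates = set()
    for key, ids in cells.items():
        for offset in product((-1, 0, 1), repeat=len(key)):
            neighbour = tuple(k + o for k, o in zip(key, offset))
            for a in ids:
                for b in cells.get(neighbour, ()):
                    if a < b:
                        candidates.add((a, b))
    return candidates
```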
Both methods lead to significantly improved execution times compared to the baseline methods, with hierarchical pruning exhibiting better relative performance. The advantage of our method on highly correlated data is more pronounced at lower correlation thresholds, when the maximality constraints allow earlier pruning. On the other hand, Stat Stream benefits from the better selectivity of higher thresholds and from sparsely correlated time series, making it a good complementary approach.
Overall, we observe that the best performance is achieved by top-k correlations, which we report in both charts (with and without hierarchical pruning) for varying top-k sizes (16, 8, and 4 disk pages). It is evident that the top-k method is much faster even for larger top-k sizes, where it matches the accuracy of the direct methods. Furthermore, the top-k method demonstrates sub-linear scalability, sustaining almost the same performance even with exponentially increasing top-k sizes.
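Since the top-k sizes above are stated in disk pages, the sketch below gives a rough illustration of what such a bound means: a min-heap retains only the strongest correlation intervals that fit into a fixed page budget. The page size, per-entry size, and function name are assumptions made for the example; this is a generic bounded buffer, not the exact top-k structure used in our experiments.

```python
import heapq

def bounded_top_k(scored_intervals, pages, page_bytes=4096, entry_bytes=64):
    """Keep only the strongest correlation intervals that fit into a fixed
    number of disk pages. `scored_intervals` yields (corr, interval) pairs;
    page_bytes and entry_bytes are assumed sizes for illustration."""
    k = max(1, (pages * page_bytes) // entry_bytes)
    heap = []                                    # min-heap of the k strongest entries
    for idx, (corr, interval) in enumerate(scored_intervals):
        item = (corr, idx, interval)             # idx breaks ties between equal scores
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif corr > heap[0][0]:
            heapq.heapreplace(heap, item)        # evict the weakest retained entry
    return [(corr, interval) for corr, _, interval in sorted(heap, reverse=True)]
```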