Thus, it is conceivable that Ipek might also learn to do the same if PwDn/PwUp actions were made available, potentially closing the performance gap with MORSE-P. This is precisely the question that the Ipek+PwDn/Up configuration in the plots is meant to answer. In that configuration, PwDn and PwUp are available actions with an immediate reward of 0 (consistent with the ad hoc reward function employed), and linear feature selection is re-run. Figures 3 and 4 show the performance and expected page status for the Ipek+PwDn/Up configuration. As we can see, the expectation of finding a bank closed drops dramatically, to levels similar to those in MORSE-P. Overall performance, however, improves little on average, and a significant gap remains. This indicates that appropriate state attributes and reward values are, in fact, the primary contributors to the performance advantage of MORSE-P over Ipek.
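
To make the Ipek+PwDn/Up setup concrete, the following minimal sketch shows one way the extended action set and its immediate reward could be expressed. It assumes that the ad hoc reward function assigns 1 to commands that transfer data (reads and writes) and 0 to every other command; the text above only states that PwDn/PwUp receive 0. The names `Command`, `BASELINE_ACTIONS`, `EXTENDED_ACTIONS`, and `immediate_reward`, as well as the exact baseline command list, are hypothetical and chosen for illustration only.

```python
from enum import Enum


class Command(Enum):
    # Illustrative DRAM command set; the baseline list is an assumption.
    PRECHARGE = "Pre"
    ACTIVATE = "Act"
    READ = "Rd"
    WRITE = "Wr"
    NOP = "Nop"
    POWER_DOWN = "PwDn"
    POWER_UP = "PwUp"


# Actions assumed to be available to the baseline Ipek scheduler.
BASELINE_ACTIONS = frozenset({
    Command.PRECHARGE,
    Command.ACTIVATE,
    Command.READ,
    Command.WRITE,
    Command.NOP,
})

# Ipek+PwDn/Up: same scheduler, with the two power actions added.
EXTENDED_ACTIONS = BASELINE_ACTIONS | {Command.POWER_DOWN, Command.POWER_UP}


def immediate_reward(cmd: Command) -> int:
    """Assumed ad hoc reward: 1 for data-transferring commands, else 0."""
    return 1 if cmd in (Command.READ, Command.WRITE) else 0
```

Under this sketch, the power actions carry no immediate reward of their own, so the scheduler can only learn to use them through their effect on the long-term value of later reads and writes.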