Online learning to rank for information retrieval holds promise for the development of “self-learning” search engines that can automatically adjust to their users. With the large amounts of interaction data, such as clicks, that can be collected in web search settings, such techniques could enable highly scalable ranking optimization. However, feedback obtained from user interactions is noisy, and developing approaches that can learn from this feedback quickly and reliably is a major challenge.