We propose a methodology for retrieving infographics in response to a user query. Our approach analyzes the user query to hypothesize the desired content of the independent and dependent axes of relevant infographics and the high-level message that a relevant infographic should convey. It then ranks candidate graphics using a mixture model that takes into account the textual content of the graphic, the relevance of its axes to the structural content requested in the user query, and the relevance of the graphic's intended message to the information need (such as a comparison) identified from the user's query. We currently focus on static simple bar charts and line graphs.
Given a query, our retrieval methodology first analyzes the query to identify the characteristics of infographics that will best satisfy the user's information need. The infographics in our digital library are then rank-ordered according to how well they satisfy this information need, as hypothesized from the user's query.
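As an illustration of the ranking step, the following minimal sketch combines the three kinds of evidence described above (textual content, axis relevance, and message relevance) into a single score and sorts the candidates by it. It is not the model used in this work: the linear combination, the weight values, and the names `ComponentScores`, `mixture_score`, and `rank_graphics` are all assumptions made for the example, and the component scores are taken as already computed.

```python
from dataclasses import dataclass


@dataclass
class ComponentScores:
    """Hypothetical per-graphic relevance scores, each assumed to lie in [0, 1]."""
    text: float     # relevance of the graphic's textual content to the query
    axes: float     # relevance of its axes to the requested structural content
    message: float  # relevance of its intended message to the information need


def mixture_score(scores, weights=(0.4, 0.3, 0.3)):
    """Linearly interpolate the three component scores.

    The weights are illustrative placeholders, not values from the paper;
    they are assumed to be non-negative and to sum to 1.
    """
    w_text, w_axes, w_message = weights
    return (w_text * scores.text
            + w_axes * scores.axes
            + w_message * scores.message)


def rank_graphics(candidates, weights=(0.4, 0.3, 0.3)):
    """Return graphic identifiers ordered from most to least relevant.

    `candidates` maps a graphic identifier to its ComponentScores.
    """
    return sorted(
        candidates,
        key=lambda gid: mixture_score(candidates[gid], weights),
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up scores for three hypothetical graphics.
    library = {
        "g1": ComponentScores(text=0.9, axes=0.1, message=0.1),
        "g2": ComponentScores(text=0.5, axes=0.8, message=0.9),
        "g3": ComponentScores(text=0.2, axes=0.2, message=0.2),
    }
    print(rank_graphics(library))
```

In this toy setting, a graphic that matches the query well on its axes and intended message can outrank one that matches only on its text, which is the behavior that motivates combining the three sources of evidence.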