Here we propose an adaptation of Wilcoxon’s two-sample rank-sum test to interval data. This adaptation is interval-valued: it computes the minimum and maximum values that the statistic takes over the set of all feasible samples (all joint samples compatible with the initial set-valued information). We prove that these bounds can be computed explicitly by an algorithm with very low computational cost. Interpreting this generalized test is straightforward: if the resulting interval-valued p-value lies entirely on one side of the significance level, we can make a decision (reject or do not reject). Otherwise, we conclude that the information is too vague to support a clear decision.
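As an illustration only, the bounds and the three-way decision rule described above might be sketched as follows. This is not necessarily the algorithm proved in the paper. It assumes closed intervals, midranks for ties, and relies on the fact that the Mann-Whitney count is non-decreasing in each first-sample value and non-increasing in each second-sample value, so the extremes are reached at interval endpoints. All function names are hypothetical, and the p-value bounds passed to `decide` are assumed to be supplied by the caller.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower, upper), assumed closed


def mann_whitney_u(xs: List[float], ys: List[float]) -> float:
    """Count pairs with x > y, counting each tie x == y as one half."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def rank_sum_bounds(x_int: List[Interval], y_int: List[Interval]) -> Tuple[float, float]:
    """Bounds on the rank sum W of the first sample over all feasible samples.

    Uses W = U + n(n+1)/2 (midranks), where n is the first-sample size.
    Minimum: every x at its lower endpoint, every y at its upper endpoint.
    Maximum: every x at its upper endpoint, every y at its lower endpoint.
    """
    n = len(x_int)
    offset = n * (n + 1) / 2
    u_min = mann_whitney_u([lo for lo, _ in x_int], [hi for _, hi in y_int])
    u_max = mann_whitney_u([hi for _, hi in x_int], [lo for lo, _ in y_int])
    return u_min + offset, u_max + offset


def decide(p_lo: float, p_hi: float, alpha: float = 0.05) -> str:
    """Three-way decision from an interval-valued p-value [p_lo, p_hi]."""
    if p_hi <= alpha:
        return "reject"
    if p_lo > alpha:
        return "do not reject"
    return "inconclusive"  # the interval straddles the significance level


# Hypothetical toy data, chosen only to exercise the functions.
x_intervals = [(1.0, 2.0), (3.0, 5.0)]
y_intervals = [(2.0, 4.0), (6.0, 7.0)]
w_min, w_max = rank_sum_bounds(x_intervals, y_intervals)
```

The pairwise count is quadratic in the sample sizes and is written this way for readability; a sort-based ranking would give the same bounds faster.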
Our method is also applicable to quantized