Inspired by the example of that great Cornellian, Vladimir Nabokov, some of your friends have become amateur lepidopterists (they study butterflies). Often when they return from a trip with speciments of butterflies, it is very difficult for them to tell how many distinct species they have caught – thanks to the fact that many species look very similar to one another.
One day they return with n butterflies, and they believe that each belongs to one of
two diffent species, which we will call A and B for purposes of this discussion. They
would like to divide the n speciments into two groups – those that belong to A and
those that belong to B – but it is very hard for them to directly label any one specimen.
So they secide to adopt the following approach.
For each pair of speciments i and j, they study them carefully side by side. If they
are confident enough in their judgment, then they label the pair (i; j) either “same”
(meaing they believe them both to come from the same species) or “different” (meaning
they believe them to come from different species). They also have the option of
rendering no judgment on a given pair, in which case we will call the pair ambiguous
So now they have the collection of n speciments, as well as a collection of m judgments
(either “same” or “different”) for the pairs that were not declared to be ambiguous.
They would like to know if this data is consistent with the idea that each butterfly
is from one of species A or B. So more concretely, we will declare the m judgmenta
to be consistent if it is possible to label each specimen either A or B in such a way
that for each pair (i; j) labled “same”, it is the case that i and j have the same label;
and for each pair (i; j) labeled “different”, it is the case that i and j have different
labels. They are in the middle of tediously working out whether their judgments are
consistent, when one of them realizes that you probably have an algorithm that would
answer this question right away.
Give an algorithm with running time O(m + n) that determines whether the m judgments
are consistent.