Complex diseases usually involve complex interactions between multiple loci. Artificial intelligence
algorithms are a plausible strategy for avoiding the resulting combinatorial explosion. However, the randomness
of the solutions they produce decreases the confidence of biological researchers in these algorithms. Meanwhile,
the lack of an efficient and effective measure to profile the distribution of cases and controls impedes the
discovery of pathogenic epistasis. Here we present an efficient method called maximum dissimilarity–
minimum entropy (MDME) to analyze breast cancer single-nucleotide polymorphism (SNP) data.
The method searches for risky barcodes, which increase the odds ratio and relative risk of breast
cancer. This method is based on the hypothesis that if a specific barcode is associated with a disease, then
the barcode permits distinction of cases from controls and, more importantly, shows a relatively
consistent pattern in cases. An analysis based on a simulated dataset explains the necessity of minimum
entropy. Experimental results show that our method can find the riskiest barcode that contributes to
breast cancer susceptibility. Our method may also mine several pathogenic barcodes that condition
different subtypes of cancer.
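
To make the quantities named above concrete, the following is a minimal sketch of the standard odds ratio, relative risk, and Shannon entropy calculations for a single candidate barcode. It is not the MDME procedure itself, which the abstract does not specify. The function names, the 2x2 count layout, and the genotype strings are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table (hypothetical layout).
    a: cases carrying the barcode,   b: controls carrying it,
    c: cases without the barcode,    d: controls without it."""
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """Risk of disease among carriers divided by risk among non-carriers."""
    return (a / (a + b)) / (c / (c + d))

def shannon_entropy(patterns):
    """Shannon entropy (in bits) of the genotype patterns observed in a group.
    Lower entropy means the group shows a more consistent pattern."""
    counts = Counter(patterns)
    total = len(patterns)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Illustrative counts only, not data from the study.
a, b, c, d = 30, 10, 20, 40
or_value = odds_ratio(a, b, c, d)
rr_value = relative_risk(a, b, c, d)

# Hypothetical genotype patterns in cases: one consistent, one scattered.
consistent = ["AA/GG"] * 4
scattered = ["AA/GG", "AG/GG", "AA/GT", "AG/GT"]
h_consistent = shannon_entropy(consistent)
h_scattered = shannon_entropy(scattered)
```

Under the stated hypothesis, a barcode of interest would be one whose carriers show elevated odds ratio and relative risk while the pattern observed in cases has low entropy, as in the `consistent` example compared with the `scattered` one.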