Mahalanobis Distance is similar to Minimum Distance, except that the covariance matrix is used in the equation. Variance and covariance are figured in so that clusters that are highly varied lead to similarly varied classes, and vice versa. For example, when classifying urban areas—typically a class whose pixels vary widely—correctly classified pixels may be farther from the mean than those of a class for water, which is usually not a highly varied class ( Swain and Davis, 1978).
Mahalanobis Distance algorithm assumes that the histograms of the bands have normal distributions. If this is not the case, you may have better results using Parallelepiped or Minimum Distance decision rule, or by performing a first-pass parallelepiped classification.