Re-sampling is a preprocessing technique that adjusts the class distribution of an imbalanced
dataset until it is nearly balanced, before the data are fed into a classifier. The
simplest re-sampling techniques are random over-sampling [14] and random
under-sampling [14]. The former randomly duplicates positive instances, i.e.,
instances of the minority class, while the latter randomly removes negative instances
from the majority class. Both techniques re-sample the dataset until the classes are
approximately equally represented. However, the random over-sampling technique
may cause overfitting [19] because the technique may make the decision
regions smaller and more specific. The random under-sampling technique, in turn,
may discard important information in the dataset. To address
these problems, improved re-sampling techniques have been studied and are described as
follows.
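
For reference, the two random baselines discussed above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in [14]: the function names, the `seed` parameter, the binary-label setting, and the toy data (10 positive and 90 negative instances) are all hypothetical and introduced here for illustration.

```python
import random


def random_oversample(samples, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority instances until both classes match in size.

    Assumes a binary problem with at least one minority instance.
    """
    rng = random.Random(seed)
    minority = [i for i, lab in enumerate(labels) if lab == minority_label]
    majority = [i for i, lab in enumerate(labels) if lab != minority_label]
    # Number of duplicates needed to reach the size of the majority class.
    n_extra = max(len(majority) - len(minority), 0)
    # Sampling with replacement: the same minority instance may be copied repeatedly.
    extra = [rng.choice(minority) for _ in range(n_extra)]
    keep = majority + minority + extra
    return [samples[i] for i in keep], [labels[i] for i in keep]


def random_undersample(samples, labels, minority_label, seed=0):
    """Keep a random subset of majority instances equal in size to the minority class."""
    rng = random.Random(seed)
    minority = [i for i, lab in enumerate(labels) if lab == minority_label]
    majority = [i for i, lab in enumerate(labels) if lab != minority_label]
    # Sampling without replacement: the majority instances not drawn are discarded.
    kept_majority = rng.sample(majority, min(len(minority), len(majority)))
    keep = kept_majority + minority
    return [samples[i] for i in keep], [labels[i] for i in keep]


# Hypothetical toy dataset: 10 positive (minority) and 90 negative (majority) instances.
samples = list(range(100))
labels = [1] * 10 + [0] * 90

X_over, y_over = random_oversample(samples, labels, minority_label=1)
X_under, y_under = random_undersample(samples, labels, minority_label=1)
```

In this sketch the over-sampled set contains only exact copies of existing positive instances, which is the source of the overfitting risk noted above, whereas the under-sampled set drops 80 of the 90 negative instances, which illustrates the potential loss of information.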