handle unorthodox data events can all have drastic
effects on the results. This clustering algorithm is moderately
fast and returns a simple index of integers which
enumerates each observation to its respective cluster.
The biggest downside of k-means clustering algorithm is
the random cluster ordering on the output; however, this
information can be indirectly accessed by looking at the
average distance between clusters (based on the supplied
metric) as well as the number of points in the cluster.
K-means algorithm divides M points in N dimensions
into K clusters so that the within-cluster sum of squares
arg min
Xk
i¼1
X
xj∈Si
jjxj−μijj2; ð6Þ
where μi is the mean of points in Si, is minimized
[49,50]. Here, we have used an implementation of the kmeans
algorithm that minimizes the sum, over all clusters,
of the within-cluster sums of point-to-cluster-centroid
distances. As a measure of distance (minimization parameter),
in our data, we have typically used sum of absolute
differences with each centroid being the component-wise
median of the points in a given cluster.
Neural networks
Artificial neural networks (ANNs) are an entire family of
algorithms, modeled after the neural system found in
the animal kingdom, used to estimate unknown functions
that may have a very large number of inputs. ANNs are
similar to the biological neural system in that they perform
functions collectively and in parallel by the computational
units, as opposed to have each unit a clearly
assigned task. In a mathematical sense, neuron’s function
f(x) can be defined as a mixture of other function
g(x) with weighting factors wi where g(x) is a non-linear
weighted sum of f(x)
f ðxÞ ¼ K
X
i
wigðxÞ
ð7Þ
Here K is commonly referred to as an activation function
that defines the node output based on the set of inputs.
What has attracted people to ANNs is the possibility
of those algorithms to simulate learning. Here by learning
we imply that for a specific task and a class of functions,
there is a set of observations to find that that relates the
solutions of the set of functions. To utilize this concept,
we must imply a cost function C which is a measure of
how far away a particular solution is from the optimal solution.
Consider the problem of finding a model f which
minimizes the cost function
C ¼ E½ðf ðxÞ þ yÞ2 ð8Þ