We will be talking a lot about distances in this book. The concept of distance between two
samples or between two variables is fundamental in multivariate analysis – almost
everything we do relates to this measure. When we deal with a single variable, we
take this concept for granted. If one sample has a pH of 6.1 and another a pH of 7.5, the
distance between them is 1.4, although we would usually call this the absolute difference. On
the pH line, the values 6.1 and 7.5 lie 1.4 units apart, and this is how we
want to start thinking about data: points on a line, points in a plane, … even points in a ten-dimensional
space! So, given samples with not one measurement on them but several, how
do we define the distance between them? There is a multitude of answers to this question, and
we devote three chapters to this topic. In the present chapter we consider what are called
Euclidean distances, which coincide with our most basic physical idea of distance, but
generalized to multidimensional points.
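
As a preview of this generalization, the following minimal sketch computes the usual straight-line (Euclidean) distance, the square root of the sum of squared coordinate differences. The function name euclidean_distance and the second measurement (values 2.0 and 5.0) are hypothetical, introduced only for illustration; only the pH values 6.1 and 7.5 come from the text above.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two points given as equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("both points must have the same number of coordinates")
    # Square each coordinate difference, sum the squares, take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# One variable (pH only): the distance reduces to the absolute difference,
# approximately 1.4 (up to floating-point rounding).
d_one = euclidean_distance([6.1], [7.5])

# Two variables: pH plus a second, hypothetical measurement.
# The squared differences are 1.4**2 and 3.0**2, so the distance is about 3.31.
d_two = euclidean_distance([6.1, 2.0], [7.5, 5.0])

print(round(d_one, 4), round(d_two, 4))
```

With a single coordinate the function gives back the absolute difference on the pH line; with more coordinates the same formula treats each sample as a point in a plane or a higher-dimensional space.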