the categorical variables were recoded as binary variables, as this did not entail a substantial loss of information: the rare categories would not have contributed meaningfully to an analysis of the main patterns in the data. Finally, for ports that do not handle any cargo at all, we assigned the label ‘missing’ to the cargo variable instead of a value of zero. Given the limited number of ports with only passenger traffic, we do not expect this to affect the results in a major way.
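
The two preprocessing steps described here can be illustrated with a minimal sketch. It is not the code used in the study: the records, the field names (`port_type`, `cargo`), and the choice of `"seaport"` as the reference category are all hypothetical, introduced only to show the idea of collapsing a categorical variable into a binary one and of marking non-cargo ports as missing rather than zero.

```python
# Hypothetical port records; names, categories, and values are illustrative only.
ports = [
    {"name": "A", "port_type": "seaport", "cargo": 1200.0},
    {"name": "B", "port_type": "seaport", "cargo": 0.0},   # passenger-only port
    {"name": "C", "port_type": "river", "cargo": 300.0},
    {"name": "D", "port_type": "canal", "cargo": 50.0},    # rare category
]


def to_binary(records, field, reference):
    """Return 1 where record[field] equals the reference category, else 0.

    Rare categories are thereby merged into a single 'other' group.
    """
    return [1 if r[field] == reference else 0 for r in records]


def cargo_or_missing(records, field="cargo"):
    """Return the cargo value, or None (missing) where no cargo is handled."""
    return [None if r[field] == 0 else r[field] for r in records]


# Assumed reference category: "seaport".
is_seaport = to_binary(ports, "port_type", "seaport")
cargo = cargo_or_missing(ports)
```

Using `None` rather than `0.0` keeps passenger-only ports out of any statistic computed on cargo volume, so they are not treated as ports with very low cargo throughput.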