Appendix
Ordinal logistic regression is a procedure aiming to predict the
odds of observing a particular score or less (Agresti, 2002; SPSS Inc.,
2008). In the case of votes based on sensation this could be formulated
as modelling the following odds:
$$U_j = \frac{P(Y \le j)}{P(Y > j)} \qquad (1)$$
where $P$ denotes probability, $Y$ is the response variable and
$j = 1, \ldots, n-1$, where $n$ is the number of classes. Class $n$ has no
odds associated with it, since the range below or equal to this class
covers the whole data set.
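As an illustration of Eq. (1), the cumulative odds can be computed directly from the number of votes in each class. The sketch below is only a minimal example: the function name `cumulative_odds` and the vote counts are hypothetical and are not taken from the study data.

```python
def cumulative_odds(counts):
    """Return the odds P(Y <= j) / P(Y > j) for j = 1, ..., n-1.

    counts[k] is the number of votes observed in ordinal class k + 1.
    Assumes at least one vote lies above every class j < n, otherwise
    the denominator would be zero.
    """
    total = sum(counts)
    odds = []
    running = 0
    for count in counts[:-1]:  # class n is skipped because P(Y <= n) = 1
        running += count
        odds.append(running / (total - running))
    return odds


# Hypothetical vote counts on a 5-class scale (illustrative numbers only).
votes = [10, 20, 40, 20, 10]
print(cumulative_odds(votes))
```

The list has n − 1 entries, one per cumulative split, and the odds increase with j because the numerator can only grow.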
The ordinal logistic model for a vector of independent variables
and controlling factors $X_i$ is then:
$$\ln U_j = a_j - \sum_i b_i X_i \qquad (2)$$
Larger location coefficients $b_i$ indicate an association with higher
votes. A positive coefficient for a dichotomous factor implies that
higher votes are more likely for the first category. A negative coefficient
implies lower votes are more likely. For a continuous variable, a
positive coefficient implies that as the values of the variable increase,
the likelihood of larger votes increases.
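The effect of the sign of a location coefficient can be seen by turning Eq. (2) into class probabilities. The following is a minimal sketch under stated assumptions: the function names, the thresholds and the coefficient are hypothetical values chosen for illustration, not fitted results.

```python
import math


def class_probabilities(thresholds, coefficients, x):
    """Return P(Y = j) for j = 1, ..., n from ln(U_j) = a_j - sum_i b_i * x_i.

    thresholds   -- increasing values a_1 < ... < a_{n-1}
    coefficients -- location coefficients b_i
    x            -- values x_i of the independent variables
    """
    location = sum(b * xi for b, xi in zip(coefficients, x))
    # Cumulative probabilities P(Y <= j), obtained by inverting the logit.
    cumulative = [1.0 / (1.0 + math.exp(-(a - location))) for a in thresholds]
    cumulative.append(1.0)  # P(Y <= n) = 1
    probs = [cumulative[0]]
    probs += [cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))]
    return probs


def expected_vote(probs):
    """Mean class index (classes numbered from 1) under the given probabilities."""
    return sum((j + 1) * p for j, p in enumerate(probs))


# Hypothetical 4-class model with one continuous variable and b > 0.
a = [-1.0, 0.0, 1.0]
b = [0.5]
low = class_probabilities(a, b, [0.0])
high = class_probabilities(a, b, [2.0])
print(expected_vote(low), expected_vote(high))
```

With a positive coefficient, increasing the variable moves probability mass from the lower classes to the higher ones, so the expected vote rises.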
Each logit, $\ln U_j$, has its own threshold $a_j$ but the same location
coefficient $b_i$ for each component of the control vector $X_i$. That
means that the effect of each independent variable is the same for the
different logit functions. This implies that the fitted logits form a set
of parallel lines or planes, one for each category of the outcome
variable, in our case for each vote. This assumption can be checked by
allowing the coefficients to vary across the logits, estimating them, and
then testing whether they are all equal.
For a single control (independent) variable $X$, if the assumption of
parallel lines is valid, the probability of a response $Y$ being greater
than $j$ when the independent variable has the value $x$ is:
$$P(Y > j \mid X = x) = 1 - \frac{e^{a_j - b x}}{1 + e^{a_j - b x}} \qquad (3)$$
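The parallel-lines property can be checked numerically on the right-hand side of Eq. (3). The sketch below uses hypothetical thresholds and a hypothetical coefficient. It evaluates the expression for several thresholds and converts the result back to the logit scale, where the curves for different j should differ only by a constant offset.

```python
import math


def eq3_probability(a_j, b, x):
    """Right-hand side of Eq. (3): 1 - exp(a_j - b*x) / (1 + exp(a_j - b*x))."""
    e = math.exp(a_j - b * x)
    return 1.0 - e / (1.0 + e)


def logit(p):
    """Log-odds of a probability p with 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Hypothetical thresholds a_j and a single positive location coefficient b.
thresholds = [-1.0, 0.5, 2.0]
b = 0.8

for x in (0.0, 1.0, 2.0):
    logits = [logit(eq3_probability(a_j, b, x)) for a_j in thresholds]
    # The gaps between consecutive logits do not depend on x.
    gaps = [logits[k] - logits[k + 1] for k in range(len(logits) - 1)]
    print(x, [round(v, 3) for v in logits], [round(g, 3) for g in gaps])
```

On the logit scale each curve is a straight line in x with slope b, and the gaps equal the differences between consecutive thresholds, which is what the parallel-lines assumption requires.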
A good model has statistically significant location coefficients $b_i$
and a favourable test of parallel lines (a large significance level, so
that the hypothesis of equal coefficients is not rejected). This is only
an initial assessment, though. The performance of the model in assigning
cases to the correct ordinal class has to be assessed in a second step.
To do that we apply the Gamma statistic to the cross-tabulation of the
original and the modelled classifications. The Gamma statistic is a
symmetric measure of association between two ordinal variables that
ranges between −1 and 1. Values close to an absolute value of 1 indicate
a strong relationship between the two variables.
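The second assessment step can be sketched as follows, assuming that the Gamma statistic referred to here is Goodman and Kruskal's gamma, (C − D) / (C + D), where C and D are the numbers of concordant and discordant pairs of cases. The function name and the cross-tabulation are hypothetical and serve only as an illustration.

```python
def goodman_kruskal_gamma(table):
    """Gamma = (C - D) / (C + D) for a crosstab with ordered rows and columns.

    table[r][c] is the number of cases with original class r and modelled
    class c. Pairs tied on either variable are ignored. Assumes at least
    one concordant or discordant pair exists.
    """
    n_rows, n_cols = len(table), len(table[0])
    concordant = 0
    discordant = 0
    for r in range(n_rows):
        for c in range(n_cols):
            for r2 in range(r + 1, n_rows):  # cases in a higher original class
                for c2 in range(n_cols):
                    pairs = table[r][c] * table[r2][c2]
                    if c2 > c:
                        concordant += pairs  # modelled class is also higher
                    elif c2 < c:
                        discordant += pairs  # modelled class is lower
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical 3 x 3 crosstab: rows are original votes, columns modelled votes.
crosstab = [
    [20, 5, 0],
    [5, 20, 5],
    [0, 5, 20],
]
print(goodman_kruskal_gamma(crosstab))
```

A table with all cases on the main diagonal gives a gamma of 1, while a table with all cases on the anti-diagonal gives −1.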