We present a method of predicting the distribution of passenger throughput across stations
and lines of a city rapid transit system by calculating the normalized betweenness centrality
of the nodes (stations) and edges of the rail network. The method is evaluated by correlating
the distribution of betweenness centrality against the throughput distribution, which is calculated
using actual passenger ridership data. Our ticketing data, from the rail transport system of
Singapore, comprises more than 14 million journeys over a span of one week. We demonstrate
that removal of outliers representing about 10% of the stations produces a statistically
significant correlation above 0.7. Interestingly, these outliers coincide with stations that opened
Although not all passengers travel along shortest paths between stations, and
passenger volumes are not uniform across origin-destination pairs,
we demonstrate that betweenness centrality can still predict, with reasonable
accuracy, the level of utilization of different portions of the network. This is shown by
the high correlation between betweenness centrality and passenger throughput when most of the stations and edges of the network are considered.
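The comparison described above can be illustrated with a minimal sketch. It assumes an unweighted, undirected network, computes normalized node betweenness with a standard breadth-first (Brandes-style) accumulation, and correlates it with throughput using Pearson's coefficient. The five-station network and the throughput figures are hypothetical placeholders, not values from the Singapore data, and only stations (not edges) are treated here.

```python
from collections import deque
from math import sqrt

def node_betweenness(adj):
    """Normalized betweenness centrality for an unweighted, undirected graph."""
    nodes = list(adj)
    bc = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(nodes)
    # Every unordered pair is visited twice, so dividing by (n-1)(n-2)
    # maps the scores onto the range [0, 1].
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: bc[v] * scale for v in nodes}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical single line with five stations: A - B - C - D - E.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
# Hypothetical weekly throughput per station (illustrative numbers only).
throughput = {"A": 100, "B": 400, "C": 520, "D": 390, "E": 120}

bc = node_betweenness(adj)
stations = sorted(adj)
r = pearson([bc[s] for s in stations], [throughput[s] for s in stations])
print(bc, round(r, 3))
```

In practice the throughput values would be aggregated from the ticketing records, and the same accumulation can be extended to edges.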
In addition, we found that the outliers of the correlation correspond exactly to the stations
and edges of a new section of the network. This suggests that although the passengers generally
conform to the structure of the network when utilizing the transportation system, this conformity
is reduced when dynamic changes are made to the structure. There may therefore exist
some lag time before passengers adapt to the new structure and adopt more efficient routes
which did not exist previously. Confirming this hypothesis would, however, require comparing
the current data against a different set of ridership data that includes
periods before and after the addition of these lines.
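The effect of excluding a small fraction of outlying stations can be sketched as follows. How the outliers were identified is not stated here, so this example assumes one simple rule: fit a least-squares line and drop the points with the largest absolute residuals. The function name `drop_largest_residuals` and all data values are hypothetical; the last station is deliberately given high centrality but low throughput to mimic a newly opened section.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def drop_largest_residuals(x, y, k):
    """Indices kept after removing the k points farthest from the fitted line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    resid = [abs(b - (slope * a + intercept)) for a, b in zip(x, y)]
    worst = set(sorted(range(n), key=lambda i: resid[i], reverse=True)[:k])
    return [i for i in range(n) if i not in worst]

# Hypothetical betweenness and throughput for ten stations (assumed values).
centrality = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]
throughput = [60, 110, 140, 210, 240, 310, 340, 420, 440, 100]

r_before = pearson(centrality, throughput)
kept = drop_largest_residuals(centrality, throughput, k=1)  # 1 of 10 stations
r_after = pearson([centrality[i] for i in kept], [throughput[i] for i in kept])
print(round(r_before, 3), round(r_after, 3), kept)
```

Comparing the dropped indices against station opening dates is then a direct check of whether the outliers belong to a new section.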
We have also shown that correlation improves significantly when distinct lines are correlated
separately. The different lines exhibit variation in slopes, indicating that the unique structural
characteristics of lines impact the pattern of ridership within the network. This is expected
since the distribution of passenger ridership gradually evolves with the structure by adjusting to
the availability of routes and convenience to individual passengers. As further complexity
is planned for the network, we hope that this work will serve as a
standard methodology for capturing some baseline information on the expected utility of a
specific segment of the rapid transit system.
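The per-line analysis mentioned above can be sketched by grouping stations by line and fitting each group separately. This is an illustration under assumed data: the line names, centrality values, and throughput values are hypothetical, and `fit_line` is a helper introduced here, not part of the original method.

```python
from math import sqrt

def fit_line(x, y):
    """Least-squares slope and intercept, plus the Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / sqrt(sxx * syy)

# Hypothetical (line, betweenness, throughput) records; values are assumed.
records = [
    ("Line 1", 0.10, 100), ("Line 1", 0.20, 210),
    ("Line 1", 0.30, 290), ("Line 1", 0.40, 400),
    ("Line 2", 0.10, 300), ("Line 2", 0.20, 620),
    ("Line 2", 0.30, 880), ("Line 2", 0.40, 1210),
]

# Group the stations by line.
by_line = {}
for line, c, t in records:
    xs, ys = by_line.setdefault(line, ([], []))
    xs.append(c)
    ys.append(t)

# Fit each line on its own, then all stations pooled together.
per_line = {line: fit_line(xs, ys) for line, (xs, ys) in by_line.items()}
pooled = fit_line([rec[1] for rec in records], [rec[2] for rec in records])

for line, (slope, intercept, r) in per_line.items():
    print(line, round(slope, 1), round(r, 3))
print("pooled", round(pooled[0], 1), round(pooled[2], 3))
```

In this toy data each line is nearly linear on its own but the two slopes differ, so pooling the stations lowers the correlation, which is the pattern the separate-line analysis is meant to expose.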