V. APPLICATION OF TIME-SERIES DATA MINING
A. Datasets
The EZ-Link card is a contactless smart card used mainly
for the payment of public transportation fares in Singapore.
For this study, we obtained one month's worth (November
2011) of EZ-Link smart card transaction data from the
LTA. An estimated 60 million train trip transactions were
made during that month. Each transaction record contains many
data columns describing a train trip. However, for the purpose
of this study, we are interested only in the following data
columns: the origin station, the destination station, and the
passenger's entry timestamp at the origin station.
B. Data Transformation
While the entry time of each trip transaction remains
critical for our analysis, the absolute timestamp value is not
"analysis-friendly" for time-series data mining. The
transaction data were therefore transformed from entry
timestamps into an origin-destination (O-D) time-interval
format suitable for time-series data mining. A Java
application was written to perform this data transformation.
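The transformation described above can be illustrated with a minimal Java sketch. It is not the application used in this study. It shows one possible reading of the O-D time-interval format, in which each entry timestamp is mapped to a fixed-width interval of the day and trips are counted per O-D pair and interval. The class name OdIntervalSketch, the Trip type, the 15-minute interval width, the "origin->destination" key format, and the station names "A", "B" and "C" are all assumptions made for illustration.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class OdIntervalSketch {
    // Hypothetical interval width in minutes; the paper does not state one.
    static final int INTERVAL_MINUTES = 15;
    static final int INTERVALS_PER_DAY = 24 * 60 / INTERVAL_MINUTES;

    // Holds only the three columns used in this study (hypothetical type).
    static final class Trip {
        final String origin;
        final String destination;
        final LocalDateTime entryTime;

        Trip(String origin, String destination, LocalDateTime entryTime) {
            this.origin = origin;
            this.destination = destination;
            this.entryTime = entryTime;
        }
    }

    // Maps an absolute entry timestamp to a zero-based interval of the day.
    static int intervalIndex(LocalDateTime entryTime) {
        int minuteOfDay = entryTime.getHour() * 60 + entryTime.getMinute();
        return minuteOfDay / INTERVAL_MINUTES;
    }

    // Builds one series of trip counts per O-D pair, indexed by interval.
    // Counts are pooled across days, which is an assumption of this sketch.
    static Map<String, int[]> toOdSeries(List<Trip> trips) {
        Map<String, int[]> series = new TreeMap<>();
        for (Trip trip : trips) {
            String key = trip.origin + "->" + trip.destination;
            int[] counts = series.computeIfAbsent(key, k -> new int[INTERVALS_PER_DAY]);
            counts[intervalIndex(trip.entryTime)]++;
        }
        return series;
    }

    public static void main(String[] args) {
        // Made-up sample records, not taken from the EZ-Link dataset.
        List<Trip> trips = Arrays.asList(
            new Trip("A", "B", LocalDateTime.of(2011, 11, 1, 8, 5)),
            new Trip("A", "B", LocalDateTime.of(2011, 11, 1, 8, 14)),
            new Trip("A", "B", LocalDateTime.of(2011, 11, 1, 8, 15)),
            new Trip("A", "C", LocalDateTime.of(2011, 11, 1, 18, 30)));

        // Print only the non-empty intervals of each O-D series.
        for (Map.Entry<String, int[]> entry : toOdSeries(trips).entrySet()) {
            int[] counts = entry.getValue();
            for (int i = 0; i < counts.length; i++) {
                if (counts[i] > 0) {
                    System.out.println(entry.getKey() + " interval " + i + ": " + counts[i]);
                }
            }
        }
    }
}
```

With a fixed interval width, every O-D pair yields a series of equal length (96 values per day at 15 minutes), so the series can be compared directly. A finer or coarser interval trades temporal resolution against sparsity of the counts.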