uss some details of our measurement
software and the collected data. Our measurement
software consists of two parts with three scripts
each. The first part is used for monitoring the global BitTorrent/Suprnova
components, and consists of the Mirror
script which measures the availability and response time
of the Suprnova mirrors, the HTML script which gathers
and parses the HTML pages of the Suprnova mirrors
and downloads all new .torrent files, and the Tracker
script which parses the .torrent files for new trackers
and checks the status of all trackers.
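
As an illustration of the Tracker script's parsing step, the following minimal sketch extracts tracker URLs from a .torrent file. It is not our measurement code: the function names bdecode and tracker_urls are hypothetical, and the sketch assumes only the standard bencoded metainfo layout with an "announce" key and, where present, the optional "announce-list" extension.

```python
from typing import Any, List, Tuple


def bdecode(data: bytes, i: int = 0) -> Tuple[Any, int]:
    """Decode one bencoded value starting at offset i.

    Returns the decoded value and the offset just past it.
    """
    c = data[i:i + 1]
    if c == b"i":  # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":  # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":  # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    if c.isdigit():  # byte string: <length>:<bytes>
        colon = data.index(b":", i)
        length = int(data[i:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencoded data at offset {i}")


def tracker_urls(torrent_bytes: bytes) -> List[str]:
    """Return the distinct tracker URLs named in a .torrent file (hypothetical helper)."""
    meta, _ = bdecode(torrent_bytes)
    urls: List[str] = []
    announce = meta.get(b"announce")
    if announce is not None:
        urls.append(announce.decode("utf-8", errors="replace"))
    # Assumption: the optional "announce-list" key holds a list of tiers,
    # each tier being a list of tracker URLs.
    for tier in meta.get(b"announce-list", []):
        for url in tier:
            text = url.decode("utf-8", errors="replace")
            if text not in urls:
                urls.append(text)
    return urls
```

Comparing the returned URLs against the set of already known trackers would yield the new trackers whose status then needs to be checked.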
The second part of our software is used for monitoring
actual peers. To follow thousands of peers at one-minute
time resolution, we used 100 nodes of our Distributed
ASCI Supercomputer (DAS, cs.vu.nl/das2). The
Hunt script selects a file to follow and initiates a measurement
of all the peers downloading this particular file,
the Getpeer script contacts the tracker for a given file and
gathers the IP addresses of peers downloading the file,
and the Peerping script contacts numerous peers in parallel
and (ab)uses the BitTorrent protocol to measure their
download progress and uptime. Once per minute, the Hunt
script checks every active Suprnova mirror for the
release of new files. Once a file is selected for measurement,
the Getpeer and Peerping scripts are also activated
at the same one-minute resolution. In this way we are able to obtain
the IP addresses of the peers that inject new content
and we can get a good estimate of the average download
speed of individual peers.
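
To make the speed estimate concrete, the sketch below derives an average download speed from periodic progress samples of one peer. It is an illustration under stated assumptions rather than our actual script: the function name average_download_speed and the representation of a sample as a (timestamp in seconds, fraction of the file completed) pair are hypothetical.

```python
from typing import List, Tuple


def average_download_speed(
    samples: List[Tuple[float, float]], file_size_bytes: int
) -> float:
    """Estimate a peer's average download speed in bytes per second.

    samples holds (timestamp_seconds, fraction_complete) pairs sorted by
    time, for example one pair per one-minute measurement round.
    """
    if len(samples) < 2:
        raise ValueError("need at least two progress samples")
    t_first, p_first = samples[0]
    t_last, p_last = samples[-1]
    if t_last <= t_first:
        raise ValueError("samples must span a positive time interval")
    # Bytes gained between the first and last observation, divided by
    # the elapsed time between those observations.
    return (p_last - p_first) * file_size_bytes / (t_last - t_first)
```

With one sample per minute, the estimate can be off by at most one sampling interval at each end of the observed download, which matters less the longer the download lasts.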
In doing our measurements, we experienced three
problems. First, our measurements were hindered by the
widespread use of firewalls [11]. When a peer is behind
a firewall, our Getpeer script can obtain its IP address,
but the Peerping script cannot send any message to
it. Therefore, our results for download speed are only
valid for non-firewalled peers. The second problem was
our inability to obtain all peer IP addresses from a tracker
directly. The BitTorrent protocol specifies that a tracker
returns only a limited number (with a default of 20) of
randomly selected peer IP addresses. We define the peer
coverage as the fraction of all peers that we actually discovered.
In all our measurements we obtained a peer coverage
of over 95%. Our final measurement problem was
caused by modifications made to the BitTorrent system
itself, which created minor gaps in our traces.
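
The peer coverage defined above can be sketched as follows, together with a rough model of how coverage grows with repeated tracker requests. Both function names are hypothetical. The model assumes that every tracker response is an independent, uniformly random selection of distinct peers from a fixed population, which real swarms only approximate because peers join and leave.

```python
from typing import Set


def peer_coverage(discovered: Set[str], all_peers: Set[str]) -> float:
    """Fraction of all peers that were actually discovered."""
    if not all_peers:
        raise ValueError("the set of all peers must not be empty")
    return len(discovered & all_peers) / len(all_peers)


def expected_coverage(
    total_peers: int, requests: int, peers_per_response: int = 20
) -> float:
    """Expected coverage after a number of tracker requests.

    Assumption: each response is an independent uniform sample of
    peers_per_response distinct peers out of total_peers, so a given
    peer is missed by one response with probability 1 - k/n.
    """
    if total_peers <= 0 or requests < 0:
        raise ValueError("total_peers must be positive, requests non-negative")
    hit = min(1.0, peers_per_response / total_peers)
    return 1.0 - (1.0 - hit) ** requests
```

Under this model the coverage approaches 1 geometrically with the number of requests, so repeated requests at a fixed interval can reach a high coverage for swarms that are not too large relative to the number of requests made.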