6.2. Pthreads Support
One benefit of porting to the Linux 3.7 kernel and the newer release of PetaLinux
(v12.12) is the addition of pthreads support in glibc, which was not present in earlier
versions. As the pthreads library makes use of the LWX and SWX instructions, and we
have altered the semantics of these instructions to be a superset of their original behavior,
testing pthreads support in the system is another way to verify the implementation
of our conditional load/store operation.
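As an illustration of this kind of check, the following minimal sketch has several threads increment a shared counter under a pthreads mutex and reports the final count. It is not the test used in this work: the names run_mutex_stress, increment_worker and MAX_THREADS are hypothetical, and the sketch assumes, as stated above, that the pthreads locking path relies on LWX/SWX. A final count lower than nthreads * iterations would indicate lost updates, and therefore a problem somewhere in the locking path.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 8   /* assumption: upper bound on worker threads */

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

struct worker_arg {
    long iterations;    /* increments performed by each thread */
};

/* Each worker increments the shared counter under the mutex. */
static void *increment_worker(void *p)
{
    const struct worker_arg *arg = p;

    for (long i = 0; i < arg->iterations; i++) {
        pthread_mutex_lock(&counter_lock);
        shared_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs nthreads workers and returns the final counter value.
 * With working mutual exclusion this equals nthreads * iterations. */
long run_mutex_stress(int nthreads, long iterations)
{
    pthread_t tid[MAX_THREADS];
    struct worker_arg arg = { iterations };

    assert(nthreads >= 1 && nthreads <= MAX_THREADS);
    shared_counter = 0;

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, increment_worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);

    return shared_counter;
}
```

A caller would compare the result of run_mutex_stress(n, k) with n * k. Such a test can only expose failures; a matching count does not by itself prove the conditional load/store implementation correct.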
Having already investigated the stand-alone bandwidth of the system, we next
examine what impact the OS has on achievable bandwidth.
To do so, we set up a multi-threaded pthreads application with the same
structure as our stand-alone test, using a barrier to synchronize the threads. Since the
OS runs from DDR rather than BRAM, we also reran the earlier stand-alone bandwidth test from DDR
to allow a more direct comparison; the results are presented in
Figure 9. In this test, the results for systems B and C are combined as there was no
appreciable difference between the two configurations. Relative to the BRAM results,
the maximum achievable application bandwidth drops significantly and no
longer saturates, even with eight cores. Single-core bandwidth is cut in half,
and bandwidth increases with core count at a lower rate than when the application is run from
BRAM. Comparing the stand-alone results with those obtained under the OS,
we see a further reduction in achievable bandwidth when running
with the OS. Although some additional overhead is expected when running under an
OS, we believe the impact is magnified here because there are no caches in the system. In
future work, we would like to measure the impact again on a system with level-one
caches to see whether the overhead of the OS remains as high.
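The structure of the pthreads-based test described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the benchmark used for Figure 9: the memory traffic is generated with memcpy between per-thread buffers, the timing uses clock_gettime, and the names run_bandwidth_test, bw_worker, bw_arg and MAX_THREADS are hypothetical. The barrier is used twice, once to release all workers together and once to mark the point at which all of them have finished.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define MAX_THREADS 8   /* assumption: matches the eight-core maximum in the text */

struct bw_arg {
    pthread_barrier_t *barrier; /* shared start/finish barrier */
    char *src, *dst;            /* per-thread buffers */
    size_t bytes;               /* bytes copied per repeat */
    int repeats;
    size_t copied;              /* bytes copied by this thread */
};

static void *bw_worker(void *p)
{
    struct bw_arg *a = p;

    pthread_barrier_wait(a->barrier);   /* all threads start together */
    for (int r = 0; r < a->repeats; r++) {
        memcpy(a->dst, a->src, a->bytes);
        a->copied += a->bytes;
    }
    pthread_barrier_wait(a->barrier);   /* all threads have finished */
    return NULL;
}

/* Returns the aggregate copy rate in bytes per second and stores the
 * total number of bytes copied in *total_copied. */
double run_bandwidth_test(int nthreads, size_t bytes, int repeats,
                          size_t *total_copied)
{
    pthread_t tid[MAX_THREADS];
    struct bw_arg arg[MAX_THREADS];
    pthread_barrier_t barrier;
    struct timespec t0, t1;
    int rc;

    assert(nthreads >= 1 && nthreads <= MAX_THREADS);
    /* The timing thread also waits on the barrier, hence nthreads + 1. */
    rc = pthread_barrier_init(&barrier, NULL, (unsigned)nthreads + 1);
    assert(rc == 0);

    for (int i = 0; i < nthreads; i++) {
        arg[i] = (struct bw_arg){ &barrier, malloc(bytes), malloc(bytes),
                                  bytes, repeats, 0 };
        assert(arg[i].src != NULL && arg[i].dst != NULL);
        memset(arg[i].src, i + 1, bytes);   /* touch pages before timing */
        memset(arg[i].dst, 0, bytes);
        rc = pthread_create(&tid[i], NULL, bw_worker, &arg[i]);
        assert(rc == 0);
    }
    (void)rc;

    pthread_barrier_wait(&barrier);         /* release the workers */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    pthread_barrier_wait(&barrier);         /* wait for all copies to end */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *total_copied = 0;
    for (int i = 0; i < nthreads; i++) {
        pthread_join(tid[i], NULL);
        *total_copied += arg[i].copied;
        free(arg[i].src);
        free(arg[i].dst);
    }
    pthread_barrier_destroy(&barrier);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return secs > 0.0 ? (double)*total_copied / secs : 0.0;
}
```

Calling run_bandwidth_test with one to eight threads and plotting the returned rate against the thread count gives a scaling curve of the kind discussed above. Timing from a separate thread that shares the barrier keeps thread creation and buffer setup out of the measured interval.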