Distribution of both data and processing
Current systems largely assume that the investigator will be handling a relatively small number of media items; in many investigations this might be only one or two forensic images. This is changing because case volumes are increasing (Justice FBI, 2012). To cope with this, it has been suggested that the next generation of forensic software could adopt a distributed processing model.

At this point we should make a distinction between distributed processing with centralised storage and distributed processing with distributed storage. A distributed processing architecture that relies on a central, non-distributed store of forensic images (Fig. 1) implies that the data must be distributed to the processing nodes before it can be processed. This is the case with FTK's 'distributed' processing. Processing time with this topology therefore depends on the rate at which the central store can deliver data to the nodes; the store rapidly becomes overloaded and limits scalability. We can mitigate this to some degree by building a storage facility based on fast SSD storage (450 MB/s), SATA III (600 MB/s) interfaces and even 10 Gb/s (roughly 1000 MB/s) Ethernet networking, but this can be prohibitively expensive, and even then the facility scales poorly beyond tens of processing hosts. Assuming we can make this investment, it can still take many hours just to read an image off the storage media. Conducting simultaneous analysis of several images held on the same storage facility would further increase data dispersal time, and therefore overall processing time.
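To put the bottleneck in concrete terms, the short sketch below models the central-store topology. All figures (case size, store throughput, node counts) are illustrative assumptions rather than measurements; the point it demonstrates is that the store's aggregate read rate is fixed, so adding processing nodes only shrinks each node's share of the bandwidth.

# Back-of-envelope model of the central-store bottleneck.
# All parameter values are illustrative assumptions, not measurements.

def per_node_bandwidth_mbs(store_throughput_mbs: float, num_nodes: int) -> float:
    """Bandwidth each processing node sees when the central store's
    aggregate read throughput is the bottleneck, shared evenly."""
    return store_throughput_mbs / num_nodes

def dispersal_time_hours(case_size_gb: float, store_throughput_mbs: float,
                         concurrent_cases: int = 1) -> float:
    """Lower bound on the time to move case data off the central store,
    regardless of how many nodes are waiting for it."""
    total_mb = case_size_gb * 1024 * concurrent_cases
    return total_mb / store_throughput_mbs / 3600

if __name__ == "__main__":
    store = 1000.0  # MB/s: roughly the usable rate of one 10 Gb/s Ethernet link
    # A hypothetical 4 TB case takes over an hour just to leave the store:
    print(f"one case:  {dispersal_time_hours(4096, store):.1f} h")    # ~1.2 h
    # Two cases analysed simultaneously on the same store double that:
    print(f"two cases: {dispersal_time_hours(4096, store, 2):.1f} h") # ~2.3 h
    # With 10 nodes each sees only 100 MB/s, slower than a single local
    # SATA disk; with 40 nodes, a mere 25 MB/s each:
    for n in (10, 40):
        print(f"{n} nodes: {per_node_bandwidth_mbs(store, n):.0f} MB/s each")

Under these assumed numbers, scaling out the compute side is futile: the dispersal time never drops below the case size divided by the store's throughput, which is the limitation that distributed storage is intended to remove.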