developed the Earth system [18] (no relation to the
Japanese Earth Simulator), which was a software
implementation of a multithreaded execution model.
Culler at Berkeley developed the threaded abstract
machine, or TAM [19], for conventional multiple-processor
systems. There are many other examples of
multithreading, including the MIND PIM architecture
under development by Sterling at Caltech.
Cellular structures built from fine-grain
devices interconnected to nearest neighbors have
been used for more than two decades in support
of a number of paradigms [20]. Two of the most
widely used are cellular automata, invented by von
Neumann [21], and systolic arrays, invented by H.T.
Kung [22], now at Harvard. Simple pipeline structures
are trivial cases of these but support important
computations such as gene mapping and string
matching.
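The nearest-neighbor update that characterizes such cellular structures can be illustrated with a minimal sketch. It assumes a one-dimensional, two-state automaton with wrap-around boundaries and a Wolfram-style rule number, which is far simpler than von Neumann's original construction; the function names `step` and `run` are hypothetical and chosen only for illustration.

```python
def step(cells, rule=90):
    """Advance a 1-D binary cellular automaton by one synchronous step.

    Each cell reads only its left neighbor, itself, and its right
    neighbor (wrap-around boundary, an assumption of this sketch).
    The three bits select one bit of the 8-bit rule number.
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left = cells[(i - 1) % n]
        centre = cells[i]
        right = cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right
        nxt.append((rule >> index) & 1)
    return nxt


def run(cells, steps, rule=90):
    """Return the list of states visited, starting with the initial one."""
    history = [list(cells)]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history


if __name__ == "__main__":
    # Print a few generations starting from a single live cell.
    for row in run([0, 0, 0, 1, 0, 0, 0], 3):
        print("".join("#" if c else "." for c in row))
```

Because every cell depends only on its immediate neighbors, all cells in a generation can in principle be updated in parallel, which is the property that makes such structures attractive for fine-grain hardware.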
CCA can be viewed as processor-in-memory
taken to its limit. Under the
DOE Office of Science Advanced Programming
Models project led by Argonne National
Laboratory, Sterling has explored a message-driven,
multithreaded, futures-synchronized advanced
programming model called “ParalleX” and is
developing an intrinsically latency-tolerant parallel
programming language called “Agincourt”. ParalleX
will provide the execution model for CCA and
Agincourt will be suitable as a high-level language
for eventual programming of CCA systems. A project
that has just started under the new DOE-sponsored
Fast-OS program is developing an ultra-lightweight
kernel with services that are distributed among many
fine-grain computing elements such as multi-core
processor chips, processor-in-memory chips, or
subarrays of CCA cells. Finally, a small two-year project
funded by DARPA starting in 1997 created the
original concepts upon which CCA is based.