As described in the introduction of the DSM, the cell
cache is the key component of IMs and OMs. It must be
able to store n cells, distribute n cells to the memory
units, and send out some extra cells in addition to
these n cells, all in the same time slot. Meanwhile, it should also
be scalable. Fig. 6 shows a possible design of the cell cache
as a 2n × n 2-dilated buffered crossbar which stores only
one cell at each crosspoint.

Fig. 4. The input module design as the distributed shared-memory.
Fig. 5. The output module design as the distributed shared-memory.

The difference between the traditional
buffered crossbar and the 2-dilated variant is that
the latter has doubled channels internally. According to
Theorem 1, we know that there will be at most n² cells in
the cache. Since the buffered crossbar is 2n × n, it has 2n
rows of n crosspoint buffers (CPBs) each. If fewer than n rows
had an empty CPB, more than n rows would be full and would
together hold more than n² cells, a contradiction. Hence we can
always find n eligible rows in the buffered crossbar, each
having at least one empty CPB. Thus, at most n arriving cells,
each from an input port, can be accommodated in the buffered
crossbar if we switch each cell to an eligible row before the
buffered crossbar. For the cell distribution, there are in total n cells to
be distributed at any time slot. We always use the n southbound
output ports to distribute cells. The n northbound
output ports are used to send out extra cells that conflict with
the normal cell distribution. For example, in OMs, the HoL
cells may still be in the cell cache, as we just discussed. As
we can see, each crosspoint buffer in the buffered crossbar
is written only once and read only once during each time
slot, so the buffers require no speedup. To output
the selected cells, if we assign each cell to an arbitrary
output port, the paths of some cells may cross each other,
as illustrated in Fig. 7a, leading to conflicts
on the internal links of the buffered crossbar. To make
sure that no two cells cross each other, we
assign the output ports to the crosspoint buffers in
ascending order of their column numbers, as shown in
Fig. 7b, which can be done with the help of a sorting network.
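
The column-order rule can be illustrated with a minimal Python sketch. It is not the authors' implementation: since Fig. 7 is not reproduced here, the sketch assumes that two cells "cross" when the cell in the lower-numbered column is given the higher-numbered output port. The function names `assign_ports_in_column_order` and `count_crossings` are hypothetical, and a software sort stands in for the sorting network.

```python
from itertools import combinations


def assign_ports_in_column_order(columns):
    """Give output port k to the selected buffer with the k-th smallest column.

    `columns[i]` is the column number of the i-th selected crosspoint buffer.
    Returns `ports`, where `ports[i]` is the output port assigned to buffer i.
    """
    order = sorted(range(len(columns)), key=lambda i: columns[i])
    ports = [0] * len(columns)
    for port, i in enumerate(order):
        ports[i] = port
    return ports


def count_crossings(columns, ports):
    """Count pairs whose column order and port order disagree (assumed model)."""
    return sum(
        1
        for i, j in combinations(range(len(columns)), 2)
        if (columns[i] - columns[j]) * (ports[i] - ports[j]) < 0
    )


# Four selected cells sitting in columns 3, 0, 2 and 1.
columns = [3, 0, 2, 1]
arbitrary = [0, 1, 2, 3]                      # ports handed out in selection order
ordered = assign_ports_in_column_order(columns)

print(count_crossings(columns, arbitrary))    # crossings under arbitrary assignment
print(count_crossings(columns, ordered))      # crossings under column-order assignment
```

Under this assumed model, the arbitrary assignment in the example yields four crossing pairs, whereas the column-ordered assignment yields none.
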
After that, the cells are switched to their desired output
ports.
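
The eligible-row argument used earlier for accommodating arriving cells can also be checked numerically. The following Python sketch assumes that the crossbar has 2n rows of n crosspoint buffers each and holds at most n² cells, as stated via Theorem 1. The helper names and the random occupancy model are illustrative assumptions, not part of the design.

```python
import random


def eligible_rows(occupied, n):
    """Rows of a 2n x n crossbar with at least one empty crosspoint buffer."""
    return [r for r in range(2 * n) if sum(occupied[r]) < n]


def random_occupancy(n, cells, rng):
    """Place `cells` cells in distinct crosspoints of a 2n x n crossbar."""
    slots = [(r, c) for r in range(2 * n) for c in range(n)]
    occupied = [[False] * n for _ in range(2 * n)]
    for r, c in rng.sample(slots, cells):
        occupied[r][c] = True
    return occupied


n = 4
rng = random.Random(0)  # fixed seed so the run is repeatable

# Worst case for the bound: n rows completely full, n rows completely empty.
worst = [[r < n] * n for r in range(2 * n)]
print(len(eligible_rows(worst, n)))

# Random placements of n*n cells: the number of eligible rows never drops below n.
min_eligible = min(
    len(eligible_rows(random_occupancy(n, n * n, rng), n)) for _ in range(200)
)
print(min_eligible >= n)
```

The worst case packs the n² cells into n full rows, which leaves exactly n eligible rows; any other placement leaves more.
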