The fetch stage is simply a memory read with the PC on our address bus. It takes one cycle of latency, which gives the instruction time to appear on the data out bus of the RAM, ready for decoding. When we encounter a memory operation (a load or a store), we need the control unit to activate the memory stage of the pipeline, which sits after Execute and before Writeback. For a memory read, the ALU calculates the memory address during Execute, that address is read during the memory stage, and the data is passed to the register file during Writeback. For a memory write, the ALU again calculates the address, and the data we want to write is always on the dataB output bus of the register file, so we connect that to the memory input bus.
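To make the sequencing concrete, here is a minimal behavioural sketch in Python. It is a software model under stated assumptions, not the actual hardware description. The names `Stage`, `next_stage`, `stages_for`, `load` and `store` are hypothetical, and so is the base-plus-offset addressing, the list-backed RAM and the register file. It shows only the two ideas from the text: the memory stage is entered only for memory operations, and a write takes its data from the dataB output.

```python
from enum import Enum, auto


class Stage(Enum):
    FETCH = auto()
    DECODE = auto()
    EXECUTE = auto()
    MEMORY = auto()
    WRITEBACK = auto()


def next_stage(stage, is_mem_op):
    """Control-unit sequencing: MEMORY is only entered for loads and stores."""
    if stage is Stage.EXECUTE:
        return Stage.MEMORY if is_mem_op else Stage.WRITEBACK
    order = [Stage.FETCH, Stage.DECODE, Stage.EXECUTE, Stage.MEMORY, Stage.WRITEBACK]
    return order[(order.index(stage) + 1) % len(order)]


def stages_for(is_mem_op):
    """List the stages one instruction passes through, starting at FETCH."""
    stage, seen = Stage.FETCH, []
    while True:
        seen.append(stage)
        stage = next_stage(stage, is_mem_op)
        if stage is Stage.FETCH:
            return seen


def load(ram, regs, base, offset, dest):
    """Memory read: address from the ALU, data into the register file."""
    addr = regs[base] + offset   # Execute: ALU computes the address (assumed base + offset)
    data = ram[addr]             # Memory: RAM is read at that address
    regs[dest] = data            # Writeback: data is passed to the register file


def store(ram, regs, base, offset, src_b):
    """Memory write: address from the ALU, data from the dataB output."""
    addr = regs[base] + offset   # Execute: ALU computes the address (assumed base + offset)
    data_b = regs[src_b]         # dataB bus output from the register file
    ram[addr] = data_b           # Memory: dataB drives the RAM input bus
```

Calling `stages_for(False)` walks Fetch, Decode, Execute and Writeback, while `stages_for(True)` also passes through the memory stage between Execute and Writeback. The real design does this with control signals and clocked registers, so treat the function calls here as a stand-in for one pass through the pipeline.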