algorithm are stored in global memory (Q in Table 1) and used in subsequent steps. At the end of the Compute-GSAI stage, the $\hat{m}_k$ values are computed via $\hat{m}_k = R^{-1} Q^T \hat{e}_k$ (where $R$ is the upper triangular matrix from the decomposition [1]) and scattered to the global memory space allocated to M.
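
The final solve-and-scatter step can be illustrated with a minimal CPU sketch in NumPy. It is not the GPU kernel described here: the function names `solve_column` and `scatter_column`, the index set `J`, the dense storage of `M`, and the random test data are all assumptions made for illustration. The sketch only shows that, given a reduced QR decomposition of a small full-column-rank submatrix, $R^{-1} Q^T \hat{e}_k$ is the least-squares solution for one column, which is then written into its positions in M.

```python
import numpy as np

def solve_column(A_hat, e_hat):
    """Return m_hat = R^{-1} Q^T e_hat for a full-column-rank A_hat (hypothetical helper)."""
    # Reduced QR: A_hat (n1 x n2) = Q (n1 x n2) @ R (n2 x n2), R upper triangular.
    Q, R = np.linalg.qr(A_hat, mode="reduced")
    # Solve R m_hat = Q^T e_hat instead of forming R^{-1} explicitly.
    return np.linalg.solve(R, Q.T @ e_hat)

def scatter_column(M, k, J, m_hat):
    """Write the computed entries into rows J of column k of M (hypothetical helper)."""
    M[J, k] = m_hat

# Illustrative data only: a 6 x 3 submatrix and a reduced unit vector.
rng = np.random.default_rng(0)
A_hat = rng.standard_normal((6, 3))
e_hat = np.zeros(6)
e_hat[2] = 1.0

m_hat = solve_column(A_hat, e_hat)

# Assumed sparsity pattern: column 4 of an 8 x 8 M has nonzeros in rows 1, 4, 6.
M = np.zeros((8, 8))
J = np.array([1, 4, 6])
scatter_column(M, k=4, J=J, m_hat=m_hat)
```

In a GPU setting each column would typically be handled independently and M would be held in a sparse format; the dense array above is used only to keep the sketch short.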