1. Pre-GSAI: reads A in a compressed sparse format [39] and transfers it to the GPU, allocates GPU memory for the preconditioner M and the other data structures, and determines the number of kernel calls from the available global memory.
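
The last part of this step, choosing the number of kernel calls from the available global memory, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that A is stored in CSR with 8-byte values and 4-byte indices, that the work is split into batches of rows, and that each row needs a fixed amount of scratch memory. The names `csr_bytes`, `plan_kernel_calls`, `bytes_per_row`, and `reserved_bytes` are hypothetical.

```python
import math


def csr_bytes(n_rows, nnz, value_bytes=8, index_bytes=4):
    """Footprint of a CSR matrix: values, column indices, and row pointers.

    Assumes double-precision values and 32-bit indices by default.
    """
    return nnz * (value_bytes + index_bytes) + (n_rows + 1) * index_bytes


def plan_kernel_calls(n_rows, bytes_per_row, free_bytes, reserved_bytes=0):
    """Return (rows_per_call, num_calls) for a row-batched computation.

    bytes_per_row  : assumed scratch memory needed per row (hypothetical).
    free_bytes     : global memory currently available on the device.
    reserved_bytes : memory set aside for A, M, and other structures.
    """
    if n_rows <= 0 or bytes_per_row <= 0:
        raise ValueError("n_rows and bytes_per_row must be positive")
    usable = free_bytes - reserved_bytes
    if usable < bytes_per_row:
        raise MemoryError("not enough global memory for a single row")
    # Largest batch of rows that fits in the usable memory.
    rows_per_call = min(n_rows, usable // bytes_per_row)
    # Number of kernel launches needed to cover all rows.
    num_calls = math.ceil(n_rows / rows_per_call)
    return rows_per_call, num_calls
```

For example, with 1000 rows, 1 KiB of scratch memory per row, and 300 KiB of usable memory, `plan_kernel_calls(1000, 1024, 300 * 1024)` gives batches of 300 rows and 4 kernel calls. In a real setting the free-memory figure would come from a device query rather than a constant.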