Although parallelizing sparse approximate inverse
preconditioners across multiple processors has been studied
extensively in previous work, which succeeded in enhancing
the execution speed of such preconditioners considerably,
few works have studied the possibility of accelerating
these preconditioners on multicore and many-core architectures.
Gravvanis et al. [14], [15] attempt to accelerate an
SAI-preconditioned BiCGStab iterative solver on an Intel multicore
architecture by allocating the computation of each iteration
of the iterative solver to a different thread; however, implementation
details on how to accelerate the preconditioner computation
on a multicore processor are not presented in their work. Xu et al. [16]
accelerate a factorized SAI preconditioner on NVIDIA GPUs. The paper
mainly describes how to accelerate the sparse matrix-vector
multiplication (SpMV) kernel in the iterative solver, but
details for computing the SAI preconditioner have not been