Problem: In an irregular pattern, the problem size
changes dynamically during processing. This is a
challenge for the CUDA computing model, which fixes
the number of threads and blocks at kernel launch.
Solution:
1) Estimate an upper bound on the number of
threads/thread blocks, then allocate GPU resources
for that bound.
2) Once processing begins, let any thread or thread
block that is determined to be useless quit
immediately.
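
The two steps can be sketched in CUDA as follows. This is only a minimal illustration, not a definitive implementation. The kernel name scale_active, the helper run_demo, the upper bound of 1024, and the block size of 128 are all assumptions made for the example. The real size is copied from the host here for simplicity; in an irregular workload it would typically be written to device memory by an earlier kernel. Error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: the real problem size (*actual_n) is only known on
// the device at run time, so the grid is launched for an upper bound.
__global__ void scale_active(const int *actual_n, float *data, int *processed)
{
    int n = *actual_n;                       // dynamic size, assumed <= upper bound
    int first = blockIdx.x * blockDim.x;     // first index owned by this block

    // Step 2 (block level): the whole block lies past the active range,
    // so every thread in it quits at once.
    if (first >= n) return;

    int i = first + threadIdx.x;

    // Step 2 (thread level): surplus thread inside a partially used block.
    if (i >= n) return;

    data[i] *= 2.0f;                         // stand-in for the real work
    atomicAdd(processed, 1);                 // count threads that did work
}

// Hypothetical host helper: returns how many threads actually did work.
int run_demo(int actual_n_host)
{
    // Step 1: estimate an upper bound and size the launch for it.
    const int upper_bound = 1024;            // assumed worst-case problem size
    const int threads = 128;                 // assumed block size
    const int blocks = (upper_bound + threads - 1) / threads;

    float *d_data = nullptr;
    int *d_n = nullptr;
    int *d_processed = nullptr;
    cudaMalloc(&d_data, upper_bound * sizeof(float));
    cudaMalloc(&d_n, sizeof(int));
    cudaMalloc(&d_processed, sizeof(int));
    cudaMemset(d_data, 0, upper_bound * sizeof(float));
    cudaMemset(d_processed, 0, sizeof(int));

    // For the demo the real size comes from the host.
    cudaMemcpy(d_n, &actual_n_host, sizeof(int), cudaMemcpyHostToDevice);

    scale_active<<<blocks, threads>>>(d_n, d_data, d_processed);

    int processed = 0;
    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&processed, d_processed, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_data);
    cudaFree(d_n);
    cudaFree(d_processed);
    return processed;
}

int main()
{
    int done = run_demo(300);
    printf("%d of 1024 launched threads did work\n", done);
    return 0;
}
```

The block-level check is logically covered by the thread-level one; it is kept to show both granularities named in step 2. The sketch has no barrier after the early exits. A kernel that calls __syncthreads() later would need more care, since threads that have already returned never reach the barrier.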