3.3. Image Prefilter
Interpolation/smoothing filters are used to reconstruct an image from the often sparse point data. There are inherent tradeoffs in the choice of filter size: small filters produce sharper, higher quality reconstructions when the points are dense, while larger filters are better at filling in the gaps between points when they are sparse.
The original render cache used a single 3x3 weighted image filter, as shown in Figure 3. This works well except when no valid point falls within the 3x3 neighborhood of a pixel; in that case the pixel was either left the color it had in the previous frame or cleared to black, depending on user preference. Neither choice works very well when the points are too sparse, which often happens when there are large changes in the image from frame to frame or when the rate of samples returned by the underlying renderer is too low.
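For concreteness, the sketch below shows one way such a 3x3 weighted reconstruction pass might be structured. The kernel weights, the SplatBuffer layout, and the names are illustrative assumptions rather than the render cache's actual code, and details such as depth tests on the splatted points are omitted.

    // Sketch of a 3x3 weighted reconstruction pass (weights and data layout
    // are illustrative assumptions, not the render cache's exact values).
    #include <vector>
    #include <cstdint>

    struct Color { float r, g, b; };

    struct SplatBuffer {
        int width, height;
        std::vector<Color> color;   // color of the point projected to each pixel
        std::vector<uint8_t> valid; // 1 if a point landed in this pixel
    };

    // 3x3 kernel favoring the center pixel (illustrative weights).
    static const float kKernel[3][3] = {
        {1.f, 2.f, 1.f},
        {2.f, 4.f, 2.f},
        {1.f, 2.f, 1.f}
    };

    // Returns false when no valid point falls in the 3x3 neighborhood, in which
    // case the caller keeps the previous color or clears the pixel to black.
    bool filter3x3(const SplatBuffer& buf, int x, int y, Color* out) {
        float wSum = 0.f, r = 0.f, g = 0.f, b = 0.f;
        for (int dy = -1; dy <= 1; ++dy) {
            for (int dx = -1; dx <= 1; ++dx) {
                int px = x + dx, py = y + dy;
                if (px < 0 || py < 0 || px >= buf.width || py >= buf.height)
                    continue;
                int idx = py * buf.width + px;
                if (!buf.valid[idx]) continue;
                float w = kKernel[dy + 1][dx + 1];
                r += w * buf.color[idx].r;
                g += w * buf.color[idx].g;
                b += w * buf.color[idx].b;
                wSum += w;
            }
        }
        if (wSum == 0.f) return false;  // filter failed: empty neighborhood
        out->r = r / wSum; out->g = g / wSum; out->b = b / wSum;
        return true;
    }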
To better handle sparse regions we introduce an additional interpolation stage, called the prefilter, with a larger 7x7 uniform filter kernel. Uniform kernels have the advantage that they are cheap to compute and their cost does not depend on the size of the kernel (e.g., [5], p. 406). Because of its larger kernel, though, the prefiltered image is unacceptably blurry in regions of high point density. We therefore run the prefilter first and then allow the normal interpolation stage to overwrite any pixels where its smaller filter produces valid data. In effect, the larger prefilter is only used where the smaller 3x3 filter fails. See Figure 4.
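The kernel-size independence comes from evaluating the uniform filter with running sums: as the window slides by one pixel, the sample entering the window is added and the sample leaving it is subtracted, so the per-pixel cost is constant regardless of kernel width. The sketch below shows one horizontal pass under that scheme (the Accum layout and names are assumptions); applying the same pass along columns gives the full 7x7 box filter, after which each pixel with a nonzero sample count receives its averaged color unless the 3x3 stage later overwrites it.

    // One horizontal pass of a sliding-window (box) prefilter of radius `radius`.
    // Each output pixel costs one add and one subtract, independent of radius.
    #include <vector>

    struct Accum { float r = 0, g = 0, b = 0, count = 0; };

    void boxFilterRow(const std::vector<Accum>& in, std::vector<Accum>& out,
                      int width, int radius) {
        Accum sum;
        // Prime the window with the samples covering pixel 0.
        for (int x = 0; x <= radius && x < width; ++x) {
            sum.r += in[x].r; sum.g += in[x].g; sum.b += in[x].b;
            sum.count += in[x].count;
        }
        for (int x = 0; x < width; ++x) {
            out[x] = sum;
            int enter = x + radius + 1;   // sample entering the window
            int leave = x - radius;       // sample leaving the window
            if (enter < width) {
                sum.r += in[enter].r; sum.g += in[enter].g;
                sum.b += in[enter].b; sum.count += in[enter].count;
            }
            if (leave >= 0) {
                sum.r -= in[leave].r; sum.g -= in[leave].g;
                sum.b -= in[leave].b; sum.count -= in[leave].count;
            }
        }
    }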
The normal interpolation stage with its 3x3 filter also produces the priority image that is used to guide sampling, and we have left this unchanged. The use of the prefilter does not affect the priority image or the choice of locations for new samples. Its purpose is simply to reduce the visual artifacts in sparse regions until the point density can be raised to a level sufficient for the 3x3 filter to work. In our implementation the prefilter is actually less expensive than the 3x3 filter and consumes only 10% of the render cache execution time.
3.4. Point Eviction
The render cache uses a fixed size cache of points, and in the original version a point would remain in the cache until it was overwritten by a new sample point. Effects such as nondiffuse shading or scene editing can cause a point’s color to become incorrect, or stale. If the rate of new samples computed per frame is very low, this stale data may remain in the point cache for a long time. We have added a new mechanism that allows points to be evicted from the cache even if there is no new point available to overwrite them. Evicting points can actually speed up image convergence by clearing out stale data more quickly.
Each point has an associated age, which is stored in a byte (0-255). At the beginning of each frame, all existing points are aged by some increment. This increment is chosen based on the number of new points added to the cache such that, on average, a point reaches an age of 128 before being overwritten. Several conditions can cause points to age at a faster rate, such as when a point is not visible in the current frame or when color changes are detected in nearby points in the image plane. These can push a point to the maximum age of 255, at which point it is automatically evicted from the cache. In the future, additional aging penalties may further improve stale data eviction.
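A minimal sketch of this bookkeeping is given below. Only the target age of roughly 128 before overwrite and the eviction threshold of 255 come from the text; the base-increment formula, the penalty factors, and the field names are assumptions.

    // Per-frame point aging and eviction (penalty factors are assumptions).
    #include <vector>
    #include <cstdint>
    #include <algorithm>

    struct CachePoint {
        uint8_t age = 0;
        bool occupied = false;
        bool visibleThisFrame = false;
        bool neighborColorChanged = false;
        // ... position, color, and other shading data
    };

    void agePoints(std::vector<CachePoint>& cache, int newSamplesThisFrame) {
        // Pick the base increment so that, at the current sampling rate, a point
        // reaches an age of about 128 before it would be overwritten anyway:
        // expected lifetime is roughly cache.size() / newSamplesThisFrame frames.
        int base = 1;
        if (newSamplesThisFrame > 0 && !cache.empty())
            base = std::max(1, (128 * newSamplesThisFrame) / (int)cache.size());

        for (CachePoint& p : cache) {
            if (!p.occupied) continue;
            int inc = base;
            if (!p.visibleThisFrame)    inc *= 4;  // penalty factor: assumption
            if (p.neighborColorChanged) inc *= 4;  // penalty factor: assumption
            int newAge = p.age + inc;
            if (newAge >= 255)
                p.occupied = false;                // evict: free the slot
            else
                p.age = (uint8_t)newAge;
        }
    }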
3.5. Other Optimizations
We have also rewritten our implementation to take advantage of the SIMD (Single Instruction, Multiple Data) instructions available through Intel’s MMX, SSE, and SSE2 instruction set extensions. These provide 8- and 16-byte vectors that can be used to operate on multiple data (e.g., four floats) in a single instruction. We can thus project four points at the same time or operate on the red, green, and blue channels of a pixel simultaneously. However, it does require some rearranging of our data structures and computations.
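As a rough illustration (not the paper's actual code), the sketch below uses SSE intrinsics to perspective-project four points per call, assuming the point positions are stored in structure-of-arrays form and a simple x/z, y/z pinhole projection; all names and parameters are hypothetical.

    // Project points [i, i+3] in one pass using 4-wide SSE arithmetic.
    // Caller guarantees at least four valid entries starting at i and z != 0.
    #include <xmmintrin.h>  // SSE intrinsics

    void projectFour(const float* camX, const float* camY, const float* camZ,
                     float* scrX, float* scrY, int i,
                     float scale, float centerX, float centerY) {
        __m128 x = _mm_loadu_ps(camX + i);
        __m128 y = _mm_loadu_ps(camY + i);
        __m128 z = _mm_loadu_ps(camZ + i);
        __m128 invZ = _mm_div_ps(_mm_set1_ps(1.0f), z);  // four reciprocals at once
        __m128 s    = _mm_set1_ps(scale);
        // screen = cam/z * scale + center, computed for four points in parallel
        _mm_storeu_ps(scrX + i,
            _mm_add_ps(_mm_mul_ps(_mm_mul_ps(x, invZ), s), _mm_set1_ps(centerX)));
        _mm_storeu_ps(scrY + i,
            _mm_add_ps(_mm_mul_ps(_mm_mul_ps(y, invZ), s), _mm_set1_ps(centerY)));
    }

The same pattern extends to the per-pixel color operations, with the red, green, and blue channels packed into a single vector.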