On 17.07.2013 02:05, Marek Olšák wrote:
No, it's not faster, but it's not slower either.
Now that I think about it, I can't come up with a good shader-based
algorithm for the resolve operation.
I don't think Christoph's approach that an MSAA texture can be viewed
as a larger single-sample texture is correct, because the physical
locations of the samples in memory usually do not correspond to the
sample locations the 3D engine used for rasterization, so fetching a
texel from the larger texture at (x,y) physical coordinates won't
always return the closest rasterized sample at those coordinates. Also
the bilinear filter would be horrible in this case, because it only
takes 4 samples per pixel.
Now let's consider implementing the scaled resolve operation in the
shader by texelFetch-ing all samples and using a bilinear filter. For
Nx MSAA, there would be N*4 texel fetches per pixel; in comparison,
separate resolve+blit needs only N+4 texel fetches per pixel. In
addition to that, the resolve is a special fixed-function blending
operation and the fragment shader is not even executed. See? Separate
resolve+blit beats everything.
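
To make the fetch arithmetic above concrete, here is a small sketch. The function names and the `taps` parameter are made up for illustration, and it assumes a 4-tap bilinear footprint and one resolved pixel per blit tap, as the paragraph does. It only counts fetches and does not model caches or the fixed-function resolve hardware.

```python
# Rough per-pixel texel fetch counts for the two approaches discussed above.
# Names are hypothetical; taps=4 corresponds to a bilinear filter.

def fetches_single_pass(n_samples, taps=4):
    """Shader-based scaled resolve: every bilinear tap fetches all N samples."""
    return n_samples * taps

def fetches_resolve_then_blit(n_samples, taps=4):
    """Separate passes: N sample reads for the resolve, then `taps` reads
    from the resolved single-sample texture for the bilinear blit."""
    return n_samples + taps

if __name__ == "__main__":
    for n in (2, 4, 8):
        print(f"{n}x MSAA: single pass = {fetches_single_pass(n)}, "
              f"resolve+blit = {fetches_resolve_then_blit(n)}")
```

For 4x MSAA this gives 16 fetches for the single-pass shader versus 8 for resolve+blit, and the gap grows with N.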
AFAICS the point of the spec is that it allows cheaper approximations
that don't use all texels and it allows the implementation to avoid
writes to a temp texture, both to save memory bandwidth. I am not sure
if it is reasonably possible to do this (without causing aliasing). How
does scaled blit on Intel hardware perform compared to resolve+blit?
Maybe it helps on bandwidth-constrained GPU configurations.
In terms of memory bandwidth per pixel, resolve+blit needs N reads and 1
write for the resolve step and 1 read for the blit step. If we assume
100% hit rate for the texture cache, scaled blit needs only N reads and
that's it. So in theory it may work. OTOH, compressed colorbuffers and
fast clear that are used by r600g should reduce actual bandwidth
requirements for the resolve step a lot. And we cannot take advantage of
the compression when we're sampling from colorbuffers. I probably just
answered this myself: resolve+blit is easier and better at least on
Radeon hardware. :)
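
The per-pixel bandwidth estimate above can be written down as a quick sketch. The function names are hypothetical, and it assumes the same simplifications as the text: a 100% texture cache hit rate for the scaled blit, and no colorbuffer compression or fast clear, which in practice should cut the resolve reads on r600g. The final write to the destination is common to both paths and is left out, as in the estimate above.

```python
# Idealized memory traffic per pixel, counted in reads/writes.
# Assumptions: perfect texture cache, no colorbuffer compression,
# destination write omitted because both paths share it.

def traffic_resolve_then_blit(n_samples):
    """N reads + 1 write (temp texture) for the resolve, 1 read for the blit."""
    return {"reads": n_samples + 1, "writes": 1}

def traffic_scaled_blit(n_samples):
    """Scaled blit samples the MSAA surface directly: N reads, no temp write."""
    return {"reads": n_samples, "writes": 0}

if __name__ == "__main__":
    for n in (2, 4, 8):
        a = traffic_resolve_then_blit(n)
        b = traffic_scaled_blit(n)
        print(f"{n}x MSAA: resolve+blit = {a['reads']} reads + {a['writes']} write, "
              f"scaled blit = {b['reads']} reads")
```

Under these assumptions the scaled blit always saves one read and one write per pixel, so the relative saving shrinks as N grows.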
Grigori