On 01/20/2013 02:15 PM, Paul Berry wrote: [snip]
As an alternative to those fixes, here's my refactoring suggestion: drop this function entirely, and instead, in brw_blorp_copytexsubimage(), create a temporary intel_renderbuffer to wrap around dst_image, and just call do_blorp_blit() directly. do_blorp_blit() already gets all the resolves correct, and it's well covered by existing tests. As a bonus, when we get around to implementing EXT_multisampled_render_to_texture, we'll probably remember to adjust do_blorp_blit() accordingly--it would be nice for this code to inherit that adjustment automatically.
That's a very good idea. It turned out to be less painful than I might've thought, and cleaned things up quite a bit. v2 implements this (which I think should solve all the resolve questions too).
[snip]
+ /* CopyTexSubImage should happen even in conditional rendering. We could + * turn it off and back on again. For now, just bail instead. + */ + if (ctx->Query.CondRenderQuery) + return false; This shouldn't be necessary. Blorp ignores any 3D pipeline state that the client has set up (including conditional rendering state), so if you just drop this check, the copy should happen just fine even if conditional rendering is on.
You're right, it wasn't necessary. I apparently had another bug originally that showed up in one of the conditional rendering tests, and I just papered over it with this. Turns out I then proceeded to fix the actual bug too, so this is just garbage. Removed in v2.
+ + /* BLORP requires the formats to match. In the future, we should relax + * this for simple format mismatches like XRGB <-> ARGB. + */ + if (_mesa_get_srgb_format_linear(src_rb->Format) != + _mesa_get_srgb_format_linear(dst_image->TexFormat)) + return false; Aren't you and Carl in the midst of relaxing that restriction right now? Perhaps we should refactor this check into a function that's shared by the blit and CopyTexSubImage paths so that once the restriction is relaxed, we don't have to remember to update both checks.
Refactoring it is probably wise. Could we do that as a follow-on patch (in the series that relaxes the restriction)?
[snip]
Destination clipping shouldn't be necessary, because the restrictions on glCopyTexSubImage prevent the user from specifying a destination rectangle that falls outside the bounds of the destination texture. See error_check_subtexture_dimensions().
[snip]
Source clipping shouldn't be necessary either, because copytexsubimage (src/mesa/main/teximage.c) calls _mesa_clip_copytexsubimage, which takes care of it.
Excellent, thanks! I've replaced the code by your comments.
+ + /* Account for the fact that in the system framebuffer, the origin is at + * the lower left. + */ + if (_mesa_is_winsys_fbo(ctx->ReadBuffer)) { + GLint tmp = src_rb->Height - srcY0; + srcY0 = src_rb->Height - srcY1; + srcY1 = tmp; + mirror_y = !mirror_y; + } If you decide to go along with my refactoring suggestion, I'd suggest a follow-up patch where we move this check (and similar checks in try_blorp_blit) inside do_blorp_blit, just so that we don't have to duplicate this code.
Apparently that's hard: try_blorp_blit and this function look at ctx->ReadBuffer/ctx->DrawBuffer, which are gl_framebuffer*s. do_blorp_blit() only looks at the src/dst intel_renderbuffers, and doesn't access ctx fields. I'd like to keep that separation, if possible.
Unfortunately, it apparently isn't possible to determine whether to Y-flip based on the intel_renderbuffers alone. Name == 0 seemed obvious, but some of our fake/wrapper RBs apparently use that. (We could change that, but I'm afraid I won't catch all cases.) I also thought to check AllocStorage == intel_alloc_window_storage, but Chad convinced me that would fail in certain corner cases as well.
So I think I'll punt on this refactoring, if possible. (I'd still like to see it happen someday.)
[snip]
Possible further improvements (I'm not asking you to do them--just pointing them out so we can keep them in mind as we look for other places we can improve performance): (1) In theory, we ought to be able to optimize do_blorp_copytexsubimage and do_blorp_blit to detect the case where the destination is being entirely overwritten, and avoid having to do a depth resolve on the destination in that case (there's no point in carefully computing values to write into a depth buffer that's about to be overwritten). (Note: a further advantage of my refactoring suggestion is that we'd only have to make this improvement in one place). (2) It's unfortunate that we still fall back to swrast when dims == 3. Really we ought to be able to do handle the 3D case in both blorp and the hardware blitter--we just have to calculate the miptree layer more carefully. Your code uses dst_image->Face, but instead it should use a combination of dst_image->Face and zoffset, in much the same way that intel_render_texture() does (in fact, it would be nice if we had some common "intel_miptree_compute_layer" function for this purpose).
Agreed. It'd be nice to fix this sometime.
(3) I have a long-term plan to update blorp so that its access to the destination miptree is HiZ-aware (blorp uses the 3D pipeline after all, so this should be possible). That will of course change what resolves are necessary. I have no idea when we'll get around to doing this, but it's worth looking into if we continue to find blit-related performance bottlenecks like this one. Thanks again for fixing this, Ken. Hopefully my suggestions aren't too onerous.
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev