On Fri, Oct 14, 2016 at 12:46:31PM -0700, Jason Ekstrand wrote: > On Wed, Oct 12, 2016 at 9:01 AM, Nanley Chery <nanleych...@gmail.com> wrote: > > > On Tue, Oct 11, 2016 at 06:55:53PM -0700, Jason Ekstrand wrote: > > > On Tue, Oct 11, 2016 at 6:16 PM, Nanley Chery <nanleych...@gmail.com> > > wrote: > > > > > > > On Mon, Oct 10, 2016 at 06:00:49PM -0700, Jason Ekstrand wrote: > > > > > On Mon, Oct 10, 2016 at 2:23 PM, Nanley Chery <nanleych...@gmail.com > > > > > > > wrote: > > > > > > > > > > > On Fri, Oct 07, 2016 at 09:41:14PM -0700, Jason Ekstrand wrote: > > > > > > > If we don't, we can end up with corruption in the portion of the > > > > depth > > > > > > > buffer that lies outside the render area when we do a HiZ > > resolve at > > > > the > > > > > > > end. The only reason we weren't seeing this before was that all > > of > > > > the > > > > > > > meta-based clears such as VkCmdClearDepthStencilImage were > > internally > > > > > > using > > > > > > > HiZ so the HiZ buffer never truly got out-of-sync. If the CTS > > ever > > > > > > tested > > > > > > > a depth upload (which doesn't care about HiZ) and then a partial > > > > render > > > > > > we > > > > > > > would have seen problems. Soon, we will be using blorp to do > > depth > > > > > > clears > > > > > > > and it won't bother with HiZ so we would get CTS regressions > > without > > > > > > this. > > > > > > > > > > > > > > > > > > > I understand the problem, but I think this solution unnecessarily > > > > > > penalizes the user's renderpass. > > > > > > > > > > > > Since depth buffer updates via vkCopy*ToImage and > > > > > > vkCmdClearDepthStencilImage cause the HiZ buffer to become stale, > > > > > > calling > > > > > > > > > > > > genX(cmd_buffer_emit_hz_op)(cmd_buffer, BLORP_HIZ_OP_HIZ_RESOLVE); > > > > > > > > > > > > at the bottom of those commands should fix the issue without the > > extra > > > > > > penalty. I'd imagine that as a prequisite, blorp would have to > > learn to > > > > > > emit enough depth stencil state for this command. > > > > > > > > > > > > > > > > I think that's dangerously mixing HiZ data validity models. There > > are 3 > > > > > basic aux data validity models that we've thrown around: > > > > > > > > > > 1) AUX is always correct. > > > > > 2) AUX is correct within a render pass and invalid outside. > > > > > 3) Track whether or not AUX is valid and resolve only as needed. > > > > > > > > > > > > > What is the definition of correct here? I'd assume you mean that the > > > > data matches what's in the depth buffer, but that sometimes may not be > > > > the case (STORE_OP_DONTCARE) yet the program behavior is correct > > > > nonetheless. > > > > > > > > > > By "correct" I mean "consistent with the depth buffer" or, more > > precicely, > > > "all well-defined pixels of the depth buffer are consistent with the HiZ > > > buffer". We *may* be able to avoid the depth resolve at the end if you > > > have STORE_OP_DONT_CARE. However, we would probably not do anything > > > interesting with LOAD_OP_DONT_CARE. > > > > > > > > > With this definition of correct (accessing either buffer will give you > > the correct value due to their being consistent with each other), the > > current implementation is arguably a course-grained version of (3) (no > > tracking, let's call this 4) than it is (2). The HiZ buffer is only > > consistent with the depth buffer when a user performs an operation that > > likely requires it to be so. For example: > > > > * LOAD_OP_LOAD -> HiZ Resolve (consistent) > > * LOAD_OP_CLEAR -> No resolve, Fast Depth Clear (inconsistent) > > * vkCmdDraw* -> No resolve (inconsistent) > > * STORE_OP_STORE -> Depth Resolve (consistent) > > > > > > > > > Also, could you please explain where the danger comes into play? > > > > > > > > > > We need to have a solid mental model of when HiZ and depth are > > consistent. > > > Otherwise, we'll make mistakes, things will get inconsistent, and we'll > > > > Agreed. > > > > > have weird bugs. This bug is a good example of this. Our mental model > > (2) > > > works fine except that we were leaking garbage depth from DONT_CARE when > > we > > > have a partial areat. Just doing a HiZ resolve after a blorp clear > > "fixes" > > > the bug by making things always consistent (mental model 1). But then it > > > > As mentioned above, I'm not advocating mixing 1 and 2, but covering a > > missed case in 4. Whether or not that mental model is solid seems like > > a subjective claim. > > > > > means that we have LOAD_OP_LOAD, we're doing two HiZ resolves which we > > > don't want either. > > > > > > > I wouldn't expect Vulkan apps to submit image copies as frequently as > > render passes, so my thinking is that an extra HiZ resolve at the end of > > an Image copy should have less of an impact on FPS than performing the > > resolve on every clearing RP that doesn't use a full render area. I > > did not write the patch to test my suggestion, but I was able to get a > > measurable difference by forcing HiZ resolves on the triangle demo. I > > won't post the numbers I obtained from the questionable method of > > eye-balling the FPS counter (especially with the amount of variance > > involved), so feel free to take a look at it yourself. > > > > I wouldn't expect copies to happen particularly frequently either and I > certainly don't think we should optimize for them. > > > > I think the temporarily minor dip in performance introduced here is a > > small price to pay for the removal of meta. I also think my suggestion > > may be a lot more work than it seemed initially, so it's likely best to > > revisit this later. > > > > Agreed. I think we need to do some reading and thinking about image > layouts. I believe that's how we're intended to do these sorts of things > in Vulkan. My initial thought would be to have a mapping from image layout > to a "use HiZ" flag that would look something like this: > > GENERAL -> no > DEPTH_STENCIL_ATTACHMENT_OPTIMAL -> yes > SHADER_READ_ONLY_OPTIMAL -> gen >= 8 > TRANSFER_SRC_OPTIMAL -> gen >= 8 > TRANSFER_DST_OPTIMAL -> no > > And then, when they do a CmdPipelineBarrier that transitions the layout in > such a way that "use HiZ" goes from yes to no we do a depth resolve and > when it goes from no to yes, we do a HiZ resolve. The layout transitions > are per-layer and per-miplevel so the client is required to do basically > the exact same tracking we do in the GL driver today. > >
This seems like a good plan. -Nanley _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev