On Tue, Jan 30, 2018 at 04:25:59PM -0800, Jason Ekstrand wrote: > On Tue, Jan 30, 2018 at 2:54 PM, Nanley Chery <nanleych...@gmail.com> wrote: > > > On Fri, Jan 19, 2018 at 03:47:34PM -0800, Jason Ekstrand wrote: > > > Even though the blorp pass looks a bit on the sketchy side, the end > > > result in the Vulkan driver is very nice. Instead of having this weird > > > case where you do a fast clear and then maybe have to resolve, we just > > > do the ambiguate and are done with it. The ambiguate does exactly what > > > we want of setting all the CCS values to 0 which puts it inot the > > ^ > > in
Typo. > > > pass-through state. > > > > > > This should also improve performance a bit in certain cases. For > > > instance, if we did a transition from UNDEFINED to GENERAL for a surface > > > that doesn't have CCS enabled all the time, we would end up doing a > > > fast-clear and then a full resolve which ends up touching every byte in > > > the main surface as well as the CCS. With the ambiguate pass, that > > > transition only touches the CCS. > > > --- > > > src/intel/vulkan/anv_blorp.c | 5 ++++ > > > src/intel/vulkan/genX_cmd_buffer.c | 54 +++++++++--------------------- > > -------- > > > 2 files changed, 17 insertions(+), 42 deletions(-) > > > > > > diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c > > > index 05efc6d..3698543 100644 > > > --- a/src/intel/vulkan/anv_blorp.c > > > +++ b/src/intel/vulkan/anv_blorp.c > > > @@ -1792,6 +1792,11 @@ anv_image_ccs_op(struct anv_cmd_buffer > > *cmd_buffer, > > > surf.surf->format, isl_to_blorp_fast_clear_op( > > ccs_op)); > > > break; > > > case ISL_AUX_OP_AMBIGUATE: > > > + for (uint32_t a = 0; a < layer_count; a++) { > > > + const uint32_t layer = base_layer + a; > > > + blorp_ccs_ambiguate(&batch, &surf, level, layer); > > > + } > > > + break; > > > default: > > > unreachable("Unsupported CCS operation"); > > > } > > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c > > b/src/intel/vulkan/genX_cmd_buffer.c > > > index 77fdadf..9e2eba3 100644 > > > --- a/src/intel/vulkan/genX_cmd_buffer.c > > > +++ b/src/intel/vulkan/genX_cmd_buffer.c > > > @@ -486,15 +486,6 @@ init_fast_clear_state_entry(struct anv_cmd_buffer > > *cmd_buffer, > > > uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect); > > > enum isl_aux_usage aux_usage = image->planes[plane].aux_usage; > > > > > > - /* The resolve flag should updated to signify that > > fast-clear/compression > > > - * data needs to be removed when leaving the undefined layout. Such > > data > > > - * may need to be removed if it would cause accesses to the color > > buffer > > > - * to return incorrect data. The fast clear data in CCS_D buffers > > should > > > - * be removed because CCS_D isn't enabled all the time. > > > - */ > > > - genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level, > > > - aux_usage == ISL_AUX_USAGE_NONE); > > > - > > > /* The fast clear value dword(s) will be copied into a surface state > > object. > > > * Ensure that the restrictions of the fields in the dword(s) are > > followed. > > > * > > > @@ -677,10 +668,9 @@ transition_color_buffer(struct anv_cmd_buffer > > *cmd_buffer, > > > for (unsigned level = base_level; level < last_level_num; level++) > > > init_fast_clear_state_entry(cmd_buffer, image, aspect, level); > > > > > > - /* Initialize the aux buffers to enable correct rendering. This > > operation > > > - * requires up to two steps: one to rid the aux buffer of data > > that may > > > - * cause GPU hangs, and another to ensure that writes done > > without aux > > > - * will be visible to reads done with aux. > > > + /* Initialize the aux buffers to enable correct rendering. In > > order to > > > + * ensure that things such as storage images work correctly, aux > > buffers > > > + * are initialized to the pass-through state. > > > > Only CCS is initialized to the pass-through state while MCS is > > fast-cleared. We may also want to update the comment below since we're > > no longer fast-clearing CCS. > > > > Right. I've replaced this entire comment with: > > /* Initialize the aux buffers to enable correct rendering. In order > to > * ensure that things such as storage images work correctly, aux > buffers > * need to be initialized to valid data. > * > * Having an aux buffer with invalid data is a problem for two > reasons: > * > * 1) Having an invalid value in the buffer can confuse the hardware. > * For instance, with CCS_E on SKL, a two-bit CCS value of 2 is > * invalid and leads to the hardware doing strange things. It > * doesn't hang as far as we can tell but rendering corruption can > * occur. > * > * 2) If this transition is into the GENERAL layout and we then use > the > * image as a storage image, then we must have the aux buffer in > the > * pass-through state so that, if we then go to texture from the > * image, we get the results of our storage image writes and not > the > * fast clear color or other random data. This reads better. Is 2) an actual issue? When we create surface states for storage images, we set the aux_usage to NONE. - Nanley > * > * For CCS both of the problems above are real demonstrable issues. > In > * that case, the only thing we can do is to perform an ambiguate to > * transition the aux surface into the pass-through state. > * > * For MCS, (2) is never an issue because we don't support > multisampled > * storage images. In theory, issue (1) is a problem with MCS but > we've > * never seen it in the wild. For 4x and 16x, all bit patters could, > in > * theory, be interpreted as something but we don't know that all bit > * patterns are actually valid. For 2x and 8x, you could easily end > up > * with the MCS referring to an invalid plane because not all bits of > * the MCS value are actually used. Even though we've never seen > issues > * in the wild, it's best to play it safe and initialize the MCS. We > * can use a fast-clear for MCS because we only ever touch from render > * and texture (no image load store). > */ > > > * > > > * Having an aux buffer with invalid data is possible for CCS > > buffers > > > * SKL+ and for MCS buffers with certain sample counts (2x and > > 8x). One > > > @@ -692,16 +682,18 @@ transition_color_buffer(struct anv_cmd_buffer > > *cmd_buffer, > > > * We don't have any data to show that this is a problem, but we > > want to > > > * avoid causing difficult-to-debug problems. > > > */ > > > - if (GEN_GEN >= 9 && image->samples == 1) { > > > + if (image->samples == 1) { > > > for (uint32_t l = 0; l < level_count; l++) { > > > const uint32_t level = base_level + l; > > > const uint32_t level_layer_count = > > > MIN2(layer_count, anv_image_aux_layers(image, aspect, > > level)); > > > anv_image_ccs_op(cmd_buffer, image, aspect, level, > > > base_layer, level_layer_count, > > > - ISL_AUX_OP_FAST_CLEAR, false); > > > + ISL_AUX_OP_AMBIGUATE, false); > > > + genX(set_image_needs_resolve)(cmd_buffer, image, > > > + aspect, level, false); > > > } > > > - } else if (image->samples > 1) { > > > + } else { > > > if (image->samples == 4 || image->samples == 16) { > > > anv_perf_warn(cmd_buffer->device->instance, image, > > > "Doing a potentially unnecessary fast-clear > > to " > > > @@ -712,32 +704,10 @@ transition_color_buffer(struct anv_cmd_buffer > > *cmd_buffer, > > > anv_image_mcs_op(cmd_buffer, image, aspect, > > > base_layer, layer_count, > > > ISL_AUX_OP_FAST_CLEAR, false); > > > - } > > > - /* At this point, some elements of the CCS buffer may have the > > fast-clear > > > - * bit-arrangement. As the user writes to a subresource, we need > > to have > > > - * the associated CCS elements enter the ambiguated state. This > > enables > > > - * reads (implicit or explicit) to reflect the user-written data > > instead > > > - * of the clear color. The only time such elements will not > > change their > > > - * state as described above, is in a final layout that doesn't > > have CCS > > > - * enabled. In this case, we must force the associated CCS > > buffers of the > > > - * specified range to enter the ambiguated state in advance. > > > - */ > > > - if (image->samples == 1 && > > > - image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_E && > > > - final_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) { > > > - /* The CCS_D buffer may not be enabled in the final layout. > > Call this > > > - * function again with a initial layout of > > COLOR_ATTACHMENT_OPTIMAL > > > - * to perform a resolve. > > > - */ > > > - anv_perf_warn(cmd_buffer->device->instance, image, > > > - "Performing an additional resolve for CCS_D > > layout " > > > - "transition. Consider always leaving it on or " > > > - "performing an ambiguation pass."); > > > - transition_color_buffer(cmd_buffer, image, aspect, > > > - base_level, level_count, > > > - base_layer, layer_count, > > > - VK_IMAGE_LAYOUT_COLOR_ > > ATTACHMENT_OPTIMAL, > > > - final_layout); > > > > Nice to be rid of this. > > > > Yup. > > > > > + for (unsigned level = base_level; level < last_level_num; > > level++) { > > > + genX(set_image_needs_resolve)(cmd_buffer, image, > > > + aspect, level, true); > > > + } > > > > Why set needs resolve true here? > > > > There's no particularly good reason. I'll delete it. We never do MCS > resolves anyway. When the time comes for me to go hook up MCS fast-clears > again, we can sort it out then. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev