On Wed, 2018-01-17 at 13:07 -0800, Jason Ekstrand wrote: > On Wed, Jan 17, 2018 at 1:27 AM, Iago Toral <ito...@igalia.com> > wrote: > > On Tue, 2018-01-16 at 08:07 -0800, Jason Ekstrand wrote: > > > On Mon, Jan 8, 2018 at 4:57 AM, Iago Toral Quiroga <itoral@igalia > > > .com> wrote: > > > > From the Vulkan spec with KHX extensions: > > > > > > > > > > > > > > > > "If queries are used while executing a render pass instance > > > > that has > > > > > > > > multiview enabled, the query uses N consecutive query > > > > indices > > > > > > > > in the query pool (starting at query) where N is the number > > > > of bits > > > > > > > > set in the view mask in the subpass the query is used in. > > > > > > > > > > > > > > > > How the numerical results of the query are distributed among > > > > the > > > > > > > > queries is implementation-dependent. For example, some > > > > implementations > > > > > > > > may write each view's results to a distinct query, while > > > > other > > > > > > > > implementations may write the total result to the first > > > > query and write > > > > > > > > zero to the other queries. However, the sum of the results > > > > in all the > > > > > > > > queries must accurately reflect the total result of the > > > > query summed > > > > > > > > over all views. Applications can sum the results from all > > > > the queries to > > > > > > > > compute the total result." > > > > > > > > > > > > > > > > In our case we only really emit a single query (in the first > > > > query index) > > > > > > > > that stores the aggregated result for all views, but we still > > > > need to manage > > > > > > > > availability for all the other query indices involved, even if > > > > we don't > > > > > > > > actually use them. > > > > > > > > > > > > > > > > This is relevant when clients call vkGetQueryPoolResults and > > > > pass all N > > > > > > > > queries to retrieve the results. In that scenario, without this > > > > patch, > > > > > > > > we will never see queries other than the first being available > > > > since we > > > > > > > > never emit them. > > > > > > > > > > > > > > > > Fixes test failures in some work-in-progress CTS > > > > multiview+query tests. > > > > > > > > --- > > > > > > > > src/intel/vulkan/genX_query.c | 36 > > > > ++++++++++++++++++++++++++++++++++++ > > > > > > > > 1 file changed, 36 insertions(+) > > > > > > > > > > > > > > > > diff --git a/src/intel/vulkan/genX_query.c > > > > b/src/intel/vulkan/genX_query.c > > > > > > > > index 7683d0d1e3..231c605b6b 100644 > > > > > > > > --- a/src/intel/vulkan/genX_query.c > > > > > > > > +++ b/src/intel/vulkan/genX_query.c > > > > > > > > @@ -462,6 +462,24 @@ void genX(CmdEndQuery)( > > > > > > > > default: > > > > > > > > unreachable(""); > > > > > > > > } > > > > > > > > + > > > > > > > > + /* When multiview is active the spec requires that N > > > > consecutive query > > > > > > > > + * indices are used, where N is the number of active views > > > > in the subpass. > > > > > > > > + * The spec allows that we only write the results to one of > > > > the queries > > > > > > > > + * but we still need to manage result availability for all > > > > the query indices. > > > > > > > > + * Since we only emit a single query for all active views > > > > in the > > > > > > > > + * first index, mark the other query indices as being > > > > already available > > > > > > > > + * with result 0. > > > > > > > > + */ > > > > > > > > + if (!cmd_buffer->state.subpass || !cmd_buffer- > > > > >state.subpass->view_mask) > > > > > > > > + return; > > > > > > > > > > I think this would be better as just an if instead of an early > > > return. > > > > > > > + > > > > > > > > + uint32_t num_queries = _mesa_bitcount(cmd_buffer- > > > > >state.subpass->view_mask); > > > > > > > > + for (uint32_t i = 1; i < num_queries; i++) { > > > > > > > > + uint64_t *slot = pool->bo.map + (query + i) * pool- > > > > >stride; > > > > > > > > + slot[0] = 1; > > > > > > > > + memset(&slot[1], 0, sizeof(uint64_t) * pool->stride); > > > > > > > > > > We can't set this from the CPU, we need to emit an > > > MI_STORE_DATA_IMM and call emit_query_availability. > > > > Oh ok, I'll do that then. Thanks for the feedback! Out of > > curiosity, what is the problem with setting this on the CPU side? I > > was thinking that since we are not emitting these queries at all > > the GPU is not going to touch the memory for them anyway... > > Because they might re-use the query pool and a given slot may be > written from the CPU or the GPU depending on how they construct the > command buffer and they may switch back and forth for a given query. > Also, touching it from the CPU happens immediately instead of when > the command executes. The combination of these two things can lead > to some very bizarre results.
Oops, of course, queries are recorded in the command buffer, that was a silly question... Iago > > > > + } > > > > > > > > } > > > > > > > > > > > > > > > > #define TIMESTAMP 0x2358 > > > > > > > > @@ -504,6 +522,24 @@ void genX(CmdWriteTimestamp)( > > > > > > > > } > > > > > > > > > > > > > > > > emit_query_availability(cmd_buffer, &pool->bo, offset); > > > > > > > > + > > > > > > > > + /* When multiview is active the spec requires that N > > > > consecutive query > > > > > > > > + * indices are used, where N is the number of active views > > > > in the subpass. > > > > > > > > + * The spec allows that we only write the results to one of > > > > the queries > > > > > > > > + * but we still need to manage result availability for all > > > > the query indices. > > > > > > > > + * Since we only emit a single query for all active views > > > > in the > > > > > > > > + * first index, mark the other query indices as being > > > > already available > > > > > > > > + * with result 0. > > > > > > > > + */ > > > > > > > > + if (!cmd_buffer->state.subpass || !cmd_buffer- > > > > >state.subpass->view_mask) > > > > > > > > + return; > > > > > > > > + > > > > > > > > + uint32_t num_queries = _mesa_bitcount(cmd_buffer- > > > > >state.subpass->view_mask); > > > > > > > > + for (uint32_t i = 1; i < num_queries; i++) { > > > > > > > > + uint64_t *slot = pool->bo.map + (query + i) * pool- > > > > >stride; > > > > > > > > + slot[0] = 1; > > > > > > > > + memset(&slot[1], 0, sizeof(uint64_t) * pool->stride); > > > > > > > > > > Same comments here. > > > > > > > + } > > > > > > > > } > > > > > > > > > > > > > > > > #if GEN_GEN > 7 || GEN_IS_HASWELL > > > > > > > > -- > > > > > > > > 2.11.0 > > > > > > > > > > > >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev