It looks good to me. Marek
On Tue, Mar 31, 2015 at 9:39 AM, Samuel Pitoiset <samuel.pitoi...@gmail.com> wrote: > > > On 03/30/2015 11:17 PM, Marek Olšák wrote: >> >> You can add a flag to each driver query identifying what kind of query >> it is (a hw perf counter or a CPU-only query). Then you can enumerate >> all queries and see if there's at least one perf counter and if so, >> advertise the extension. >> >> Or add a CAP and let drivers decide if they want the extension or not. >> You still need the per-query flag to skip the non-hw queries though. > > > I would prefer to add this GPU/CPU flags on groups instead of per-query > because it seems like much simpler to define groups of both GPU and CPU > counters. > Currently, I think only nvc0 driver exposes GPU hardware performance > counters So, I removed the patches for svga, freedreno and radeon drivers in > my series. > > In my opinion, other drivers which want to enable GL_AMD_perfmon extension > should *explicitly* define groups of GPU counters. > Currently, that flag is called PIPE_DRIVER_QUERY_GROUP_TYPE_{CPU,GPU}. > > If you want to have a quick look, the code is located under the > gl_amd_perfmon_v3 branch. > http://cgit.freedesktop.org/~hakzsam/mesa/log/?h=gl_amd_perfmon_v3 > > Let me know if you have strong objections against it. > > >> >> Marek >> >> On Mon, Mar 30, 2015 at 3:06 PM, Samuel Pitoiset >> <samuel.pitoi...@gmail.com> wrote: >>> >>> >>> On 03/29/2015 06:11 PM, Martin Peres wrote: >>>> >>>> On 29/03/2015 17:56, Samuel Pitoiset wrote: >>>>> >>>>> >>>>> On 03/28/2015 09:43 PM, Martin Peres wrote: >>>>>> >>>>>> On 22/03/2015 17:35, Samuel Pitoiset wrote: >>>>>>> >>>>>>> From: Christoph Bumiller <e0425...@student.tuwien.ac.at> >>>>>>> >>>>>>> This is based on the original patch of Christoph Bumiller. >>>>>>> (source: http://people.freedesktop.org/~chrisbmr/perfmon.diff) >>>>>> >>>>>> It would be nice if you could add "v2: Samuel Pitoiset" and tell what >>>>>> you changed. Christoph may delete his perfmon.diff and no-one will be >>>>>> able to diff the diffs :) >>>>> >>>>> Good idea! >>>>> >>>>>>> As for the Gallium HUD, we keep a list of busy queries in a ring >>>>>>> buffer in order to prevent stalls when reading queries. >>>>>>> >>>>>>> Drivers must implement get_driver_query_group_info and >>>>>>> get_driver_query_info in order to enable this extension. >>>>>>> >>>>>>> Signed-off-by: Samuel Pitoiset <samuel.pitoi...@gmail.com> >>>>>>> --- >>>>>>> src/mesa/Makefile.sources | 2 + >>>>>>> src/mesa/state_tracker/st_cb_perfmon.c | 455 >>>>>>> +++++++++++++++++++++++++++++++++ >>>>>>> src/mesa/state_tracker/st_cb_perfmon.h | 70 +++++ >>>>>>> src/mesa/state_tracker/st_context.c | 4 + >>>>>>> src/mesa/state_tracker/st_extensions.c | 3 + >>>>>>> 5 files changed, 534 insertions(+) >>>>>>> create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c >>>>>>> create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h >>>>>>> >>>>>>> diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources >>>>>>> index 217be9a..e54e618 100644 >>>>>>> --- a/src/mesa/Makefile.sources >>>>>>> +++ b/src/mesa/Makefile.sources >>>>>>> @@ -432,6 +432,8 @@ STATETRACKER_FILES = \ >>>>>>> state_tracker/st_cb_flush.h \ >>>>>>> state_tracker/st_cb_msaa.c \ >>>>>>> state_tracker/st_cb_msaa.h \ >>>>>>> + state_tracker/st_cb_perfmon.c \ >>>>>>> + state_tracker/st_cb_perfmon.h \ >>>>>>> state_tracker/st_cb_program.c \ >>>>>>> state_tracker/st_cb_program.h \ >>>>>>> state_tracker/st_cb_queryobj.c \ >>>>>>> diff --git a/src/mesa/state_tracker/st_cb_perfmon.c >>>>>>> b/src/mesa/state_tracker/st_cb_perfmon.c >>>>>>> new file mode 100644 >>>>>>> index 0000000..fb6774b >>>>>>> --- /dev/null >>>>>>> +++ b/src/mesa/state_tracker/st_cb_perfmon.c >>>>>>> @@ -0,0 +1,455 @@ >>>>>>> +/* >>>>>>> + * Copyright (C) 2013 Christoph Bumiller >>>>>>> + * Copyright (C) 2015 Samuel Pitoiset >>>>>>> + * >>>>>>> + * Permission is hereby granted, free of charge, to any person >>>>>>> obtaining a >>>>>>> + * copy of this software and associated documentation files (the >>>>>>> "Software"), >>>>>>> + * to deal in the Software without restriction, including without >>>>>>> limitation >>>>>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>>>>> sublicense, >>>>>>> + * and/or sell copies of the Software, and to permit persons to whom >>>>>>> the >>>>>>> + * Software is furnished to do so, subject to the following >>>>>>> conditions: >>>>>>> + * >>>>>>> + * The above copyright notice and this permission notice shall be >>>>>>> included in >>>>>>> + * all copies or substantial portions of the Software. >>>>>>> + * >>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >>>>>>> EXPRESS OR >>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>> MERCHANTABILITY, >>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >>>>>>> EVENT SHALL >>>>>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES >>>>>>> OR >>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>>>> OTHERWISE, >>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE >>>>>>> USE OR >>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. >>>>>>> + */ >>>>>>> + >>>>>>> +/** >>>>>>> + * Performance monitoring counters interface to gallium. >>>>>>> + */ >>>>>>> + >>>>>>> +#include "st_context.h" >>>>>>> +#include "st_cb_bitmap.h" >>>>>>> +#include "st_cb_perfmon.h" >>>>>>> + >>>>>>> +#include "util/bitset.h" >>>>>>> + >>>>>>> +#include "pipe/p_context.h" >>>>>>> +#include "pipe/p_screen.h" >>>>>>> +#include "util/u_memory.h" >>>>>>> + >>>>>>> +/** >>>>>>> + * Return a PIPE_QUERY_x type >= PIPE_QUERY_DRIVER_SPECIFIC, or -1 >>>>>>> if >>>>>>> + * the driver-specific query doesn't exist. >>>>>>> + */ >>>>>>> +static int >>>>>>> +find_query_type(struct pipe_screen *screen, const char *name) >>>>>>> +{ >>>>>>> + int num_queries; >>>>>>> + int type = -1; >>>>>>> + int i; >>>>>>> + >>>>>>> + num_queries = screen->get_driver_query_info(screen, 0, NULL); >>>>>>> + if (!num_queries) >>>>>>> + return type; >>>>>>> + >>>>>>> + for (i = 0; i < num_queries; i++) { >>>>>>> + struct pipe_driver_query_info info; >>>>>>> + >>>>>>> + if (!screen->get_driver_query_info(screen, i, &info)) >>>>>>> + continue; >>>>>>> + >>>>>>> + if (!strncmp(info.name, name, strlen(name))) { >>>>>>> + type = info.query_type; >>>>>>> + break; >>>>>>> + } >>>>>>> + } >>>>>>> + return type; >>>>>>> +} >>>>>>> + >>>>>>> +static bool >>>>>>> +init_perf_monitor(struct gl_context *ctx, struct >>>>>>> gl_perf_monitor_object *m) >>>>>>> +{ >>>>>>> + struct st_perf_monitor_object *stm = st_perf_monitor_object(m); >>>>>>> + struct pipe_screen *screen = st_context(ctx)->pipe->screen; >>>>>>> + struct pipe_context *pipe = st_context(ctx)->pipe; >>>>>>> + int gid, cid; >>>>>>> + >>>>>>> + st_flush_bitmap_cache(st_context(ctx)); >>>>>>> + >>>>>>> + /* Create a query for each active counter. */ >>>>>>> + for (gid = 0; gid < ctx->PerfMonitor.NumGroups; gid++) { >>>>>>> + const struct gl_perf_monitor_group *g = >>>>>>> &ctx->PerfMonitor.Groups[gid]; >>>>>>> + for (cid = 0; cid < g->NumCounters; cid++) { >>>>>>> + const struct gl_perf_monitor_counter *c = >>>>>>> &g->Counters[cid]; >>>>>>> + struct st_perf_counter_object *cntr; >>>>>>> + int query_type; >>>>>>> + >>>>>>> + if (!BITSET_TEST(m->ActiveCounters[gid], cid)) >>>>>>> + continue; >>>>>> >>>>>> It would seem like the extension would not work with more than 32 >>>>>> counters per group. >>>>>> >>>>>> This certainly is not a problem on the NVIDIA side but it may become a >>>>>> problem for another >>>>>> GPU manufacturer. It may warrant a note disclosing this limitation. >>>>>> What do you think? >>>>> >>>>> Ok, I'll add a note for that. >>>>> >>>>>>> + >>>>>>> + query_type = find_query_type(screen, c->Name); >>>>>>> + assert(query_type != -1); >>>>>>> + >>>>>>> + cntr = CALLOC_STRUCT(st_perf_counter_object); >>>>>>> + if (!cntr) >>>>>>> + return false; >>>>>>> + >>>>>>> + cntr->queries[cntr->head] = pipe->create_query(pipe, >>>>>>> query_type, 0); >>>>>>> + cntr->query_type = query_type; >>>>>>> + cntr->id = cid; >>>>>>> + cntr->group_id = gid; >>>>>>> + >>>>>>> + list_addtail(&cntr->list, &stm->active_counters); >>>>>>> + } >>>>>>> + } >>>>>>> + return true; >>>>>>> +} >>>>>>> + >>>>>>> +static void >>>>>>> +reset_perf_monitor(struct st_perf_monitor_object *stm, >>>>>>> + struct pipe_context *pipe) >>>>>>> +{ >>>>>>> + struct st_perf_counter_object *cntr, *tmp; >>>>>>> + int i; >>>>>>> + >>>>>>> + LIST_FOR_EACH_ENTRY_SAFE(cntr, tmp, &stm->active_counters, list) >>>>>>> { >>>>>>> + for (i = 0; i < ARRAY_SIZE(cntr->queries); i++) { >>>>>>> + if (cntr->queries[i]) >>>>>>> + pipe->destroy_query(pipe, cntr->queries[i]); >>>>>>> + } >>>>>>> + list_del(&cntr->list); >>>>>>> + free(cntr); >>>>>>> + } >>>>>>> +} >>>>>>> + >>>>>>> +static unsigned >>>>>>> +read_query_results(struct pipe_context *pipe, GLenum type, >>>>>>> + struct st_perf_counter_object *cntr, >>>>>>> + union pipe_query_result *results) >>>>>>> +{ >>>>>>> + unsigned num_results = 0; >>>>>>> + >>>>>>> + while (1) { >>>>>>> + union pipe_query_result result; >>>>>>> + >>>>>>> + if (!pipe->get_query_result(pipe, cntr->queries[cntr->tail], >>>>>>> + FALSE, &result)) >>>>>>> + break; >>>>>>> + >>>>>>> + switch (type) { >>>>>>> + case GL_UNSIGNED_INT64_AMD: >>>>>>> + (*results).u64 += result.u64; >>>>>>> + break; >>>>>>> + case GL_UNSIGNED_INT: >>>>>>> + (*results).u32 += result.u32; >>>>>>> + break; >>>>>>> + case GL_FLOAT: >>>>>>> + case GL_PERCENTAGE_AMD: >>>>>>> + (*results).f += result.f; >>>>>>> + break; >>>>>>> + } >>>>>>> + num_results++; >>>>>>> + >>>>>>> + if (cntr->tail == cntr->head) >>>>>>> + break; >>>>>>> + >>>>>>> + cntr->tail = (cntr->tail + 1) % ST_PERFMON_NUM_QUERIES; >>>>>>> + } >>>>>>> + return num_results; >>>>>>> +} >>>>>>> + >>>>>>> +static struct gl_perf_monitor_object * >>>>>>> +st_NewPerfMonitor() >>>>>>> +{ >>>>>>> + struct st_perf_monitor_object *stq = >>>>>>> ST_CALLOC_STRUCT(st_perf_monitor_object); >>>>>>> + if (stq) { >>>>>>> + list_inithead(&stq->active_counters); >>>>>>> + return &stq->base; >>>>>>> + } >>>>>>> + return NULL; >>>>>>> +} >>>>>>> + >>>>>>> +static void >>>>>>> +st_DeletePerfMonitor(struct gl_context *ctx, struct >>>>>>> gl_perf_monitor_object *m) >>>>>>> +{ >>>>>>> + struct st_perf_monitor_object *stm = st_perf_monitor_object(m); >>>>>>> + struct pipe_context *pipe = st_context(ctx)->pipe; >>>>>>> + >>>>>>> + reset_perf_monitor(stm, pipe); >>>>>>> + FREE(stm); >>>>>>> +} >>>>>>> + >>>>>>> +static GLboolean >>>>>>> +st_BeginPerfMonitor(struct gl_context *ctx, struct >>>>>>> gl_perf_monitor_object *m) >>>>>>> +{ >>>>>>> + struct st_perf_monitor_object *stm = st_perf_monitor_object(m); >>>>>>> + struct pipe_context *pipe = st_context(ctx)->pipe; >>>>>>> + struct st_perf_counter_object *cntr; >>>>>>> + int gid; >>>>>>> + >>>>>>> + /* Check the number of active counters for each group. */ >>>>>>> + for (gid = 0; gid < ctx->PerfMonitor.NumGroups; gid++) { >>>>>>> + const struct gl_perf_monitor_group *g = >>>>>>> &ctx->PerfMonitor.Groups[gid]; >>>>>>> + if (m->ActiveGroups[gid] > g->MaxActiveCounters) { >>>>>>> + /* Maximum number of counters reached. Cannot start the >>>>>>> session. */ >>>>>>> + return false; >>>>>>> + } >>>>>>> + } >>>>>> >>>>>> I was about to complain that you did not check for errors returned by >>>>>> pipe_query, but I guess that with the above test, only the lying >>>>>> drivers would get impacted. It is going to be hard for Nouveau to >>>>>> allow more than 1 counter at a time because of the B6 ones... >>>>> >>>>> This extension has not been designed for NVIDIA hardware. :-) >>>>> But yes, it will not really possible to monitor more than 1 counter per >>>>> group for "special" counting modes like B4/B6. >>>> >>>> >>>> As discussed on IRC, it will still be possible, but it will be more >>>> annoying for applications. We can discuss about this later when the core >>>> gallium code is in place :) >>>> >>>>>>> + >>>>>>> + if (LIST_IS_EMPTY(&stm->active_counters)) { >>>>>>> + /* Create queries for each active counter before starting >>>>>>> + * a new monitoring session. */ >>>>>>> + if (!init_perf_monitor(ctx, m)) >>>>>>> + goto fail; >>>>>>> + } else { >>>>>>> + /* As for the Gallium HUD, we keep a list of busy queries in a >>>>>>> ring >>>>>>> + * buffer in order to prevent stalls when reading queries. */ >>>>>>> + LIST_FOR_EACH_ENTRY(cntr, &stm->active_counters, list) { >>>>>>> + if ((cntr->head + 1) % ST_PERFMON_NUM_QUERIES == >>>>>>> cntr->tail) >>>>>>> { >>>>>>> + /* All queries are busy, throw away the last query and >>>>>>> create >>>>>>> + * a new one. */ >>>>>>> + pipe->destroy_query(pipe, cntr->queries[cntr->head]); >>>>>>> + cntr->queries[cntr->head] = >>>>>>> + pipe->create_query(pipe, cntr->query_type, 0); >>>>>> >>>>>> Shouldn't you block here, waiting for the oldest query to complete? By >>>>>> dropping some queries, you may create a deadlock if the application >>>>>> submits a ton of queries and then waits for them sequentially. >>>>>> >>>>>> An easy fix is to make sure that ST_PERFMON_NUM_QUERIES > 2 * >>>>>> g->MaxActiveCounters, then, you can simply drop the oldest request. >>>>>> Here is why: >>>>> >>>>> Yeah, this was a bit insane, I'll make the changes and try to avoid >>>>> that >>>>> possible deadlock. >>>> >>>> >>>> Wonderful! >>>>> >>>>> >>>>>> - The spec forbids nested BeginPerfMonitor calls. >>>>>> - EndPerfMonitor make previous counter values impossible to reach. >>>>>> >>>>>> So, the worst case scenario is the following: >>>>>> BeginPerfMonitor(mon=1, ...) >>>>>> EndPerfMonitor(mon=1, ...) >>>>>> BeginPerfMonitor(mon=1, ...) >>>>>> GetPerfMonitorCounterData(mon=1, ...) >>>>> >>>>> No, this scenario is not possible. >>>>> This is actually not clearly specified by the spec but the "core >>>>> support" in mesa doesn't allow to read counters data while a monitoring >>>>> sessions is active, and this makes sense. >>>> >>>> >>>> In this case, you only need to have ST_PERFMON_NUM_QUERIES == >>>> g->MaxActiveCounters >>>>> >>>>> >>>>>> Anything that happened before the first BeginPerf here is irrelevant >>>>>> and can be safely optimised away. >>>>>> >>>>>> I would thus suggest dynamically allocating this ring buffer and >>>>>> dropping the oldest requests. To make the code simpler, you could >>>>>> simply always select the oldest. >>>>>> >>>>>> Any thoughts on this? >>>>>> >>>>>> + } else { >>>>>> + /* Add a new query for this frame. */ >>>>>> + cntr->head = (cntr->head + 1) % ST_PERFMON_NUM_QUERIES; >>>>>> + if (!cntr->queries[cntr->head]) { >>>>>> + cntr->queries[cntr->head] = >>>>>> + pipe->create_query(pipe, cntr->query_type, 0); >>>>>> + } >>>>>> + } >>>>>> + } >>>>>> + } >>>>>> + >>>>>> + /* Start the newest query for each active counter. */ >>>>>> + LIST_FOR_EACH_ENTRY(cntr, &stm->active_counters, list) { >>>>>> + if (!pipe->begin_query(pipe, cntr->queries[cntr->head])) >>>>>> + goto fail; >>>>>> + } >>>>>> + return true; >>>>>> + >>>>>> +fail: >>>>>> + /* Failed to start the monitoring session. */ >>>>>> + reset_perf_monitor(stm, pipe); >>>>>> + return false; >>>>>> +} >>>>>> + >>>>>> +static void >>>>>> +st_EndPerfMonitor(struct gl_context *ctx, struct >>>>>> gl_perf_monitor_object *m) >>>>>> +{ >>>>>> + struct st_perf_monitor_object *stm = st_perf_monitor_object(m); >>>>>> + struct pipe_context *pipe = st_context(ctx)->pipe; >>>>>> + struct st_perf_counter_object *cntr; >>>>>> + >>>>>> + /* Stop the newest query for each active counter. */ >>>>>> + LIST_FOR_EACH_ENTRY(cntr, &stm->active_counters, list) >>>>>> + pipe->end_query(pipe, cntr->queries[cntr->head]); >>>>>> +} >>>>>> + >>>>>> +static void >>>>>> +st_ResetPerfMonitor(struct gl_context *ctx, struct >>>>>> gl_perf_monitor_object *m) >>>>>> +{ >>>>>> + struct st_perf_monitor_object *stm = st_perf_monitor_object(m); >>>>>> + struct pipe_context *pipe = st_context(ctx)->pipe; >>>>>> + >>>>>> + if (!m->Ended) >>>>>> + st_EndPerfMonitor(ctx, m); >>>>>> + >>>>>> + reset_perf_monitor(stm, pipe); >>>>>> + >>>>>> + if (m->Active) >>>>>> + st_BeginPerfMonitor(ctx, m); >>>>>> +} >>>>>> + >>>>>> +static GLboolean >>>>>> +st_IsPerfMonitorResultAvailable(struct gl_context *ctx, >>>>>> + struct gl_perf_monitor_object *m) >>>>>> +{ >>>>>> + struct st_perf_monitor_object *stm = st_perf_monitor_object(m); >>>>>> + struct pipe_context *pipe = st_context(ctx)->pipe; >>>>>> + struct st_perf_counter_object *cntr; >>>>>> + >>>>>> + if (LIST_IS_EMPTY(&stm->active_counters)) >>>>>> + return false; >>>>>> + >>>>>> + /* The result of a monitoring session is only available if the >>>>>> oldest >>>>>> + * query of each active counter is idle. */ >>>>>> + LIST_FOR_EACH_ENTRY(cntr, &stm->active_counters, list) { >>>>>> + union pipe_query_result result; >>>>>> + if (!pipe->get_query_result(pipe, cntr->queries[cntr->tail], >>>>>> + FALSE, &result)) { >>>>>> + /* The oldest query is busy. */ >>>>>> + return false; >>>>>> + } >>>>>> + } >>>>>> + return true; >>>>>> +} >>>>>> + >>>>>> +static void >>>>>> +st_GetPerfMonitorResult(struct gl_context *ctx, >>>>>> + struct gl_perf_monitor_object *m, >>>>>> + GLsizei dataSize, >>>>>> + GLuint *data, >>>>>> + GLint *bytesWritten) >>>>>> +{ >>>>>> + struct st_perf_monitor_object *stm = st_perf_monitor_object(m); >>>>>> + struct pipe_context *pipe = st_context(ctx)->pipe; >>>>>> + struct st_perf_counter_object *cntr; >>>>>> + >>>>>> + /* Copy data to the supplied array (data). >>>>>> + * >>>>>> + * The output data format is: <group ID, counter ID, value> for >>>>>> each >>>>>> + * active counter. The API allows counters to appear in any order. >>>>>> + */ >>>>>> + GLsizei offset = 0; >>>>>> + >>>>>> + /* Read query results for each active counter. */ >>>>>> + LIST_FOR_EACH_ENTRY(cntr, &stm->active_counters, list) { >>>>>> + union pipe_query_result results = {}; >>>>>> + unsigned num_results = 0; >>>>>> + int gid, cid; >>>>>> + GLenum type; >>>>>> + >>>>>> + cid = cntr->id; >>>>>> + gid = cntr->group_id; >>>>>> + type = ctx->PerfMonitor.Groups[gid].Counters[cid].Type; >>>>>> + >>>>>> + num_results = read_query_results(pipe, type, cntr, &results); >>>>>> + if (!num_results) >>>>>> + continue; >>>>>> >>>>>> It seems like you do not allow the case I presented earlier >>>>>> (GetPerfMonitorCounterData after BeginPerfMonitor) unless you rely on >>>>>> the drivers to do the right thing? >>>>> >>>>> See above. >>>>> >>>>>> + >>>>>> + data[offset++] = gid; >>>>>> + data[offset++] = cid; >>>>>> + switch (type) { >>>>>> + case GL_UNSIGNED_INT64_AMD: >>>>>> + *(uint64_t *)&data[offset] = results.u64 / num_results; >>>>>> + offset += sizeof(uint64_t) / sizeof(GLuint); >>>>>> + break; >>>>>> + case GL_UNSIGNED_INT: >>>>>> + *(uint32_t *)&data[offset] = results.u32 / num_results; >>>>>> + offset += sizeof(uint32_t) / sizeof(GLuint); >>>>>> + break; >>>>>> + case GL_FLOAT: >>>>>> + case GL_PERCENTAGE_AMD: >>>>>> + *(GLfloat *)&data[offset] = results.f / (float)num_results; >>>>>> + offset += sizeof(GLfloat) / sizeof(GLuint); >>>>>> + break; >>>>>> + } >>>>>> + } >>>>>> + >>>>>> + if (bytesWritten) >>>>>> + *bytesWritten = offset * sizeof(GLuint); >>>>>> +} >>>>>> + >>>>>> +bool >>>>>> +st_init_perfmon(struct st_context *st) >>>>>> +{ >>>>>> + struct gl_perf_monitor_state *perfmon = &st->ctx->PerfMonitor; >>>>>> + struct pipe_screen *screen = st->pipe->screen; >>>>>> + struct gl_perf_monitor_group *groups = NULL; >>>>>> + int num_counters, num_groups; >>>>>> + int gid, cid; >>>>>> + >>>>>> + if (!screen->get_driver_query_info || >>>>>> !screen->get_driver_query_group_info) >>>>>> + return false; >>>>>> + >>>>>> + /* Get the number of available queries. */ >>>>>> + num_counters = screen->get_driver_query_info(screen, 0, NULL); >>>>>> + if (!num_counters) >>>>>> + return false; >>>>>> + >>>>>> + /* Get the number of available groups. */ >>>>>> + num_groups = screen->get_driver_query_group_info(screen, 0, NULL); >>>>>> + if (num_groups) >>>>>> + groups = CALLOC(num_groups, sizeof(*groups)); >>>>>> + if (!groups) >>>>>> + return false; >>>>>> + >>>>>> + for (gid = 0; gid < num_groups; gid++) { >>>>>> + struct gl_perf_monitor_group *g = &groups[gid]; >>>>>> + struct pipe_driver_query_group_info group_info; >>>>>> + struct gl_perf_monitor_counter *counters = NULL; >>>>>> + >>>>>> + if (!screen->get_driver_query_group_info(screen, gid, >>>>>> &group_info)) >>>>>> + continue; >>>>>> + >>>>>> + g->Name = group_info.name; >>>>>> + g->MaxActiveCounters = group_info.max_active_queries; >>>>>> + g->NumCounters = 0; >>>>>> + g->Counters = NULL; >>>>>> + >>>>>> + if (group_info.num_queries) >>>>>> + counters = CALLOC(group_info.num_queries, >>>>>> sizeof(*counters)); >>>>>> + if (!counters) >>>>>> + goto fail; >>>>>> + >>>>>> + for (cid = 0; cid < num_counters; cid++) { >>>>>> + struct gl_perf_monitor_counter *c = >>>>>> &counters[g->NumCounters]; >>>>>> + struct pipe_driver_query_info info; >>>>>> + >>>>>> + if (!screen->get_driver_query_info(screen, cid, &info)) >>>>>> + continue; >>>>>> + if (info.group_id != gid) >>>>>> + continue; >>>>>> + >>>>>> + c->Name = info.name; >>>>>> + switch (info.type) { >>>>>> + case PIPE_DRIVER_QUERY_TYPE_UINT64: >>>>>> + c->Minimum.u64 = 0; >>>>>> + c->Maximum.u64 = info.max_value.u64; >>>>>> + c->Type = GL_UNSIGNED_INT64_AMD; >>>>>> + break; >>>>>> + case PIPE_DRIVER_QUERY_TYPE_UINT: >>>>>> + c->Minimum.u32 = 0; >>>>>> + c->Maximum.u32 = info.max_value.u32; >>>>>> + c->Type = GL_UNSIGNED_INT; >>>>>> + break; >>>>>> + case PIPE_DRIVER_QUERY_TYPE_FLOAT: >>>>>> + c->Minimum.f = 0.0; >>>>>> + c->Maximum.f = info.max_value.f; >>>>>> + c->Type = GL_FLOAT; >>>>>> + break; >>>>>> + case PIPE_DRIVER_QUERY_TYPE_PERCENTAGE: >>>>>> + c->Minimum.f = 0.0; >>>>>> + c->Maximum.f = 100.0; >>>>>> + c->Type = GL_PERCENTAGE_AMD; >>>>>> + break; >>>>>> + default: >>>>>> + unreachable("Should never happen: invalid driver query >>>>>> type"); >>>>>> >>>>>> >>>>>> The definition of "unreachable" should be enough, no need to add >>>>>> "Should never happen:" in my opinion. >>>>> >>>>> Sure. >>>>> >>>>>>> + } >>>>>>> + g->NumCounters++; >>>>>>> + } >>>>>>> + g->Counters = counters; >>>>>>> + } >>>>>>> + >>>>>>> + perfmon->NumGroups = num_groups; >>>>>>> + perfmon->Groups = groups; >>>>>>> + return true; >>>>>>> + >>>>>>> +fail: >>>>>>> + for (gid = 0; gid < num_groups; gid++) >>>>>>> + FREE((void *)groups[gid].Counters); >>>>>>> + FREE(groups); >>>>>>> + return false; >>>>>>> +} >>>>>>> + >>>>>>> +void >>>>>>> +st_destroy_perfmon(struct st_context *st) >>>>>>> +{ >>>>>>> + struct gl_perf_monitor_state *perfmon = &st->ctx->PerfMonitor; >>>>>>> + int gid; >>>>>>> + >>>>>>> + for (gid = 0; gid < perfmon->NumGroups; gid++) >>>>>>> + FREE((void *)perfmon->Groups[gid].Counters); >>>>>>> + FREE((void *)perfmon->Groups); >>>>>>> +} >>>>>>> + >>>>>>> +void st_init_perfmon_functions(struct dd_function_table *functions) >>>>>>> +{ >>>>>>> + functions->NewPerfMonitor = st_NewPerfMonitor; >>>>>>> + functions->DeletePerfMonitor = st_DeletePerfMonitor; >>>>>>> + functions->BeginPerfMonitor = st_BeginPerfMonitor; >>>>>>> + functions->EndPerfMonitor = st_EndPerfMonitor; >>>>>>> + functions->ResetPerfMonitor = st_ResetPerfMonitor; >>>>>>> + functions->IsPerfMonitorResultAvailable = >>>>>>> st_IsPerfMonitorResultAvailable; >>>>>>> + functions->GetPerfMonitorResult = st_GetPerfMonitorResult; >>>>>> >>>>>> I don't see SelectPerfMonitorCountersAMD, who defined the entry >>>>>> points? Seems like I am missing something here. Are you piggy backing >>>>>> on the intel implementation? >>>>> >>>>> SelectPerfMonitorCountersAMD is defined in >>>>> src/mesa/main/performance_monitor.c >>>> >>>> >>>> Ok, I will have a look at it later. Thanks. >>>> >>>>>>> +} >>>>>>> diff --git a/src/mesa/state_tracker/st_cb_perfmon.h >>>>>>> b/src/mesa/state_tracker/st_cb_perfmon.h >>>>>>> new file mode 100644 >>>>>>> index 0000000..ba5bf04 >>>>>>> --- /dev/null >>>>>>> +++ b/src/mesa/state_tracker/st_cb_perfmon.h >>>>>>> @@ -0,0 +1,70 @@ >>>>>>> +/* >>>>>>> + * Copyright (C) 2013 Christoph Bumiller >>>>>>> + * Copyright (C) 2015 Samuel Pitoiset >>>>>>> + * >>>>>>> + * Permission is hereby granted, free of charge, to any person >>>>>>> obtaining a >>>>>>> + * copy of this software and associated documentation files (the >>>>>>> "Software"), >>>>>>> + * to deal in the Software without restriction, including without >>>>>>> limitation >>>>>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>>>>> sublicense, >>>>>>> + * and/or sell copies of the Software, and to permit persons to whom >>>>>>> the >>>>>>> + * Software is furnished to do so, subject to the following >>>>>>> conditions: >>>>>>> + * >>>>>>> + * The above copyright notice and this permission notice shall be >>>>>>> included in >>>>>>> + * all copies or substantial portions of the Software. >>>>>>> + * >>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >>>>>>> EXPRESS OR >>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>> MERCHANTABILITY, >>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >>>>>>> EVENT SHALL >>>>>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES >>>>>>> OR >>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>>>> OTHERWISE, >>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE >>>>>>> USE OR >>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. >>>>>>> + */ >>>>>>> + >>>>>>> +#ifndef ST_CB_PERFMON_H >>>>>>> +#define ST_CB_PERFMON_H >>>>>>> + >>>>>>> +#include "util/u_double_list.h" >>>>>>> + >>>>>>> +#define ST_PERFMON_NUM_QUERIES 8 >>>>>>> + >>>>>>> +/** >>>>>>> + * Subclass of gl_perf_monitor_object >>>>>>> + */ >>>>>>> +struct st_perf_monitor_object >>>>>>> +{ >>>>>>> + struct gl_perf_monitor_object base; >>>>>>> + struct list_head active_counters; >>>>>>> +}; >>>>>>> + >>>>>>> +struct st_perf_counter_object >>>>>>> +{ >>>>>>> + struct list_head list; >>>>>>> + int query_type; >>>>>>> + int id; >>>>>>> + int group_id; >>>>>>> + >>>>>>> + /* Ring of queries. If a query is busy, we use another slot. */ >>>>>>> + struct pipe_query *queries[ST_PERFMON_NUM_QUERIES]; >>>>>>> + unsigned head, tail; >>>>>>> +}; >>>>>>> + >>>>>>> +/** >>>>>>> + * Cast wrapper >>>>>>> + */ >>>>>>> +static INLINE struct st_perf_monitor_object * >>>>>>> +st_perf_monitor_object(struct gl_perf_monitor_object *q) >>>>>>> +{ >>>>>>> + return (struct st_perf_monitor_object *)q; >>>>>>> +} >>>>>>> + >>>>>>> +bool >>>>>>> +st_init_perfmon(struct st_context *st); >>>>>>> + >>>>>>> +void >>>>>>> +st_destroy_perfmon(struct st_context *st); >>>>>>> + >>>>>>> +extern void >>>>>>> +st_init_perfmon_functions(struct dd_function_table *functions); >>>>>>> + >>>>>>> +#endif >>>>>>> diff --git a/src/mesa/state_tracker/st_context.c >>>>>>> b/src/mesa/state_tracker/st_context.c >>>>>>> index 5fe132a..2146424 100644 >>>>>>> --- a/src/mesa/state_tracker/st_context.c >>>>>>> +++ b/src/mesa/state_tracker/st_context.c >>>>>>> @@ -51,6 +51,7 @@ >>>>>>> #include "st_cb_fbo.h" >>>>>>> #include "st_cb_feedback.h" >>>>>>> #include "st_cb_msaa.h" >>>>>>> +#include "st_cb_perfmon.h" >>>>>>> #include "st_cb_program.h" >>>>>>> #include "st_cb_queryobj.h" >>>>>>> #include "st_cb_readpixels.h" >>>>>>> @@ -116,6 +117,7 @@ st_destroy_context_priv(struct st_context *st) >>>>>>> st_destroy_bitmap(st); >>>>>>> st_destroy_drawpix(st); >>>>>>> st_destroy_drawtex(st); >>>>>>> + st_destroy_perfmon(st); >>>>>>> for (shader = 0; shader < >>>>>>> ARRAY_SIZE(st->state.sampler_views); >>>>>>> shader++) { >>>>>>> for (i = 0; i < ARRAY_SIZE(st->state.sampler_views[0]); >>>>>>> i++) { >>>>>>> @@ -190,6 +192,7 @@ st_create_context_priv( struct gl_context *ctx, >>>>>>> struct pipe_context *pipe, >>>>>>> st_init_bitmap(st); >>>>>>> st_init_clear(st); >>>>>>> st_init_draw( st ); >>>>>>> + st_init_perfmon(st); >>>>>> >>>>>> Why is there no checks here on the return value? >>>>> >>>>> Good catch! >>>>> >>>>>>> /* Choose texture target for glDrawPixels, glBitmap, >>>>>>> renderbuffers */ >>>>>>> if (pipe->screen->get_param(pipe->screen, >>>>>>> PIPE_CAP_NPOT_TEXTURES)) >>>>>>> @@ -414,6 +417,7 @@ void st_init_driver_functions(struct >>>>>>> dd_function_table *functions) >>>>>>> st_init_fbo_functions(functions); >>>>>>> st_init_feedback_functions(functions); >>>>>>> st_init_msaa_functions(functions); >>>>>>> + st_init_perfmon_functions(functions); >>>>>>> st_init_program_functions(functions); >>>>>>> st_init_query_functions(functions); >>>>>>> st_init_cond_render_functions(functions); >>>>>>> diff --git a/src/mesa/state_tracker/st_extensions.c >>>>>>> b/src/mesa/state_tracker/st_extensions.c >>>>>>> index bc20f73..5b6d6c6 100644 >>>>>>> --- a/src/mesa/state_tracker/st_extensions.c >>>>>>> +++ b/src/mesa/state_tracker/st_extensions.c >>>>>>> @@ -629,6 +629,9 @@ void st_init_extensions(struct pipe_screen >>>>>>> *screen, >>>>>>> extensions->OES_EGL_image_external = GL_TRUE; >>>>>>> extensions->OES_draw_texture = GL_TRUE; >>>>>>> + if (screen->get_driver_query_info && >>>>>>> screen->get_driver_query_group_info) >>>>>>> + extensions->AMD_performance_monitor = GL_TRUE; >>>>>> >>>>>> Isn't this a bit silly to enable this extension if a driver does not >>>>>> export any group or counter? >>>>>> The return value of st_init_perfmon() would be well suited for >>>>>> enabling or not the extension. >>>>>> >>>>>> I am OK with the current solution though as driver devs should not >>>>>> export those functions >>>>>> unless they do have counters to expose. >>>>> >>>>> I don't think this is silly because there is no point to define those >>>>> functions if the driver doesn't expose any counters. >>>> >>>> >>>> You mean it would be silly for the driver dev to export those functions >>>> if >>>> they had not counters? Yes, I guess you are right. >>> >>> >>> Well, actually it's a bad solution to expose this extension like that >>> because amd perfmon must only expose hw performance counters. >>> The good one is to enable it *only* if the driver expose GPU counters. >>> >>> >>> >>>>>>> + >>>>>>> /* Expose the extensions which directly correspond to gallium >>>>>>> caps. */ >>>>>>> for (i = 0; i < ARRAY_SIZE(cap_mapping); i++) { >>>>>>> if (screen->get_param(screen, cap_mapping[i].cap)) { >>>>>> >>>>>> Great work Samuel! 1-5 are R-b: Martin Peres <martin.pe...@free.fr> >>>>>> >>>>>> As for this patch, I will hold my R-b until we clear this query >>>>>> management issues. >>>>> >>>>> Thanks a lot for the review. >>>> >>>> >>> _______________________________________________ >>> mesa-dev mailing list >>> mesa-dev@lists.freedesktop.org >>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev