On Wed, Feb 15, 2017 at 1:49 PM, Nicolai Hähnle <nhaeh...@gmail.com> wrote: > On 14.02.2017 23:51, Marek Olšák wrote: >> >> On Tue, Feb 14, 2017 at 11:47 PM, Timothy Arceri <tarc...@itsqueeze.com> >> wrote: >>> >>> >>> >>> On 15/02/17 08:35, Marek Olšák wrote: >>>> >>>> >>>> On Tue, Feb 14, 2017 at 9:53 PM, Timothy Arceri <tarc...@itsqueeze.com> >>>> wrote: >>>>> >>>>> >>>>> On 15/02/17 02:14, Marek Olšák wrote: >>>>> >>>>>> On Tue, Feb 14, 2017 at 1:52 AM, Timothy Arceri >>>>>> <tarc...@itsqueeze.com> >>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> --- >>>>>>> src/gallium/drivers/r600/r600_pipe.c | 10 ++++++++++ >>>>>>> src/gallium/drivers/radeon/r600_pipe_common.c | 2 +- >>>>>>> src/gallium/drivers/radeon/r600_pipe_common.h | 1 + >>>>>>> src/gallium/drivers/radeonsi/si_pipe.c | 11 +++++++++++ >>>>>>> src/gallium/include/pipe/p_screen.h | 3 +++ >>>>>>> src/mesa/state_tracker/st_context.c | 2 ++ >>>>>>> 6 files changed, 28 insertions(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/src/gallium/drivers/r600/r600_pipe.c >>>>>>> b/src/gallium/drivers/r600/r600_pipe.c >>>>>>> index 5290f40..bdd8c0a 100644 >>>>>>> --- a/src/gallium/drivers/r600/r600_pipe.c >>>>>>> +++ b/src/gallium/drivers/r600/r600_pipe.c >>>>>>> @@ -600,6 +600,7 @@ static void r600_destroy_screen(struct >>>>>>> pipe_screen* >>>>>>> pscreen) >>>>>>> compute_memory_pool_delete(rscreen->global_pool); >>>>>>> } >>>>>>> >>>>>>> + disk_cache_destroy(rscreen->b.b.disk_shader_cache); >>>>>>> r600_destroy_common_screen(&rscreen->b); >>>>>>> } >>>>>>> >>>>>>> @@ -633,6 +634,15 @@ struct pipe_screen *r600_screen_create(struct >>>>>>> radeon_winsys *ws) >>>>>>> return NULL; >>>>>>> } >>>>>>> >>>>>>> + uint32_t mesa_timestamp; >>>>>>> + if (disk_cache_get_function_timestamp(r600_screen_create, >>>>>>> &mesa_timestamp)) { >>>>>>> + char *timestamp_str; >>>>>>> + if (asprintf(×tamp_str, "%u", mesa_timestamp) != >>>>>>> -1) >>>>>>> { >>>>>>> + rscreen->b.b.disk_shader_cache = >>>>>>> disk_cache_create(r600_get_chip_name(&rscreen->b), timestamp_str); >>>>>>> + free(timestamp_str); >>>>>>> + } >>>>>>> + } >>>>>>> + >>>>>>> if (rscreen->b.info.chip_class >= EVERGREEN) { >>>>>>> rscreen->b.b.is_format_supported = >>>>>>> evergreen_is_format_supported; >>>>>>> } else { >>>>>>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c >>>>>>> b/src/gallium/drivers/radeon/r600_pipe_common.c >>>>>>> index 95a6a48..4e5582f 100644 >>>>>>> --- a/src/gallium/drivers/radeon/r600_pipe_common.c >>>>>>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c >>>>>>> @@ -722,7 +722,7 @@ static const char* r600_get_device_vendor(struct >>>>>>> pipe_screen* pscreen) >>>>>>> return "AMD"; >>>>>>> } >>>>>>> >>>>>>> -static const char* r600_get_chip_name(struct r600_common_screen >>>>>>> *rscreen) >>>>>>> +const char* r600_get_chip_name(struct r600_common_screen *rscreen) >>>>>>> { >>>>>>> switch (rscreen->info.family) { >>>>>>> case CHIP_R600: return "AMD R600"; >>>>>>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h >>>>>>> b/src/gallium/drivers/radeon/r600_pipe_common.h >>>>>>> index 6eff9aa..0449d4d 100644 >>>>>>> --- a/src/gallium/drivers/radeon/r600_pipe_common.h >>>>>>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.h >>>>>>> @@ -765,6 +765,7 @@ void radeon_save_cs(struct radeon_winsys *ws, >>>>>>> struct >>>>>>> radeon_winsys_cs *cs, >>>>>>> struct radeon_saved_cs *saved); >>>>>>> void radeon_clear_saved_cs(struct radeon_saved_cs *saved); >>>>>>> bool r600_check_device_reset(struct r600_common_context *rctx); >>>>>>> +const char* r600_get_chip_name(struct r600_common_screen *rscreen); >>>>>>> >>>>>>> /* r600_gpu_load.c */ >>>>>>> void r600_gpu_load_kill_thread(struct r600_common_screen >>>>>>> *rscreen); >>>>>>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c >>>>>>> b/src/gallium/drivers/radeonsi/si_pipe.c >>>>>>> index 853d850..0bb95b1 100644 >>>>>>> --- a/src/gallium/drivers/radeonsi/si_pipe.c >>>>>>> +++ b/src/gallium/drivers/radeonsi/si_pipe.c >>>>>>> @@ -712,6 +712,7 @@ static void si_destroy_screen(struct pipe_screen* >>>>>>> pscreen) >>>>>>> } >>>>>>> } >>>>>>> pipe_mutex_destroy(sscreen->shader_parts_mutex); >>>>>>> + disk_cache_destroy(sscreen->b.b.disk_shader_cache); >>>>>>> si_destroy_shader_cache(sscreen); >>>>>>> r600_destroy_common_screen(&sscreen->b); >>>>>>> } >>>>>>> @@ -801,6 +802,16 @@ struct pipe_screen >>>>>>> *radeonsi_screen_create(struct >>>>>>> radeon_winsys *ws) >>>>>>> return NULL; >>>>>>> } >>>>>>> >>>>>>> + uint32_t mesa_timestamp, llvm_timestamp; >>>>>>> + if (disk_cache_get_function_timestamp(radeonsi_screen_create, >>>>>>> &mesa_timestamp) && >>>>>>> + >>>>>>> disk_cache_get_function_timestamp(LLVMInitializeAMDGPUTargetInfo, >>>>>>> &llvm_timestamp)) { >>>>>>> + char *timestamp_str; >>>>>>> + if (asprintf(×tamp_str, "%u_%u", mesa_timestamp, >>>>>>> llvm_timestamp) != -1) { >>>>>>> + sscreen->b.b.disk_shader_cache = >>>>>>> disk_cache_create(r600_get_chip_name(&sscreen->b), timestamp_str); >>>>>> >>>>>> >>>>>> Can you please at least make some effort to get close to 80 chars per >>>>>> line even though it's not strictly required in this driver? >>>>>> >>>>>>> + free(timestamp_str); >>>>>>> + } >>>>>>> + } >>>>>>> + >>>>>>> si_handle_env_var_force_family(sscreen); >>>>>>> >>>>>>> if (!debug_get_bool_option("RADEON_DISABLE_PERFCOUNTERS", >>>>>>> false)) >>>>>>> diff --git a/src/gallium/include/pipe/p_screen.h >>>>>>> b/src/gallium/include/pipe/p_screen.h >>>>>>> index b6203f1..6fd527f 100644 >>>>>>> --- a/src/gallium/include/pipe/p_screen.h >>>>>>> +++ b/src/gallium/include/pipe/p_screen.h >>>>>>> @@ -42,6 +42,7 @@ >>>>>>> #include "pipe/p_format.h" >>>>>>> #include "pipe/p_defines.h" >>>>>>> #include "pipe/p_video_enums.h" >>>>>>> +#include "util/disk_cache.h" >>>>>>> >>>>>>> >>>>>>> >>>>>>> @@ -318,6 +319,8 @@ struct pipe_screen { >>>>>>> const void *(*get_compiler_options)(struct pipe_screen *screen, >>>>>>> enum pipe_shader_ir ir, >>>>>>> unsigned shader); >>>>>>> + >>>>>>> + struct disk_cache *disk_shader_cache; >>>>>>> }; >>>>>> >>>>>> >>>>>> Changes to gallium/include/pipe must be separate commits. We don't >>>>>> change this interface unless there is a very good reason to do so. You >>>>>> might have to get Brian's or Roland's ack if you wanna pursue this >>>>>> change, though the likelihood of acceptance is pretty low on this one. >>>>>> >>>>>> However you can ignore all the above, because there is a different >>>>>> issue to discuss. If I understand it correctly, this only caches TGSI. >>>>>> What's the point in putting the cache into the driver in that case? >>>>> >>>>> >>>>> I was trying to land the glsl ir and tgsi pieces and avoid getting >>>>> stuck >>>>> discussing the GCN cache patch at this point, there will defiantly be a >>>>> GCN >>>>> cache. >>>>>> >>>>>> >>>>>> If >>>>>> the goal is to have a GLSL->TGSI cache, it should live in st/mesa. If >>>>>> the goal is to have a GLSL->GCN cache, it might live in the driver. >>>>>> However, even that is not necessary if you just expose shader binaries >>>>>> to st/mesa, in which case st/mesa can query binaries from the driver >>>>>> and the GLSL->GCN cache can live entirely in st/mesa. If you need the >>>>>> lib timestamp in st/mesa, you can also add a query to pipe_screen >>>>>> returning the lib timestamp only. Then you'll have both native shader >>>>>> binaries and the driver compiler timestamp exposed to st/mesa. So I'd >>>>>> like to understand why the cache is in the driver. >>>>> >>>>> >>>>> This might work ok for radeonsi as you don't really have to worry about >>>>> variants, but what about other drivers? It would seem odd to add >>>>> something >>>>> to the st if its only used by a single driver wouldn't it? Happy to be >>>>> wrong >>>>> about this, but this was my thought process here. >>>> >>>> >>>> OK. That's a fair point. Nouveau and radeonsi don't need a cache in >>>> the driver, > > > > Not sure what you're saying here. Having the GCN in the cache would > certainly help. > > That said, implementing this in the state tracker via an interface like > below makes sense to me. > > >>>> but other drivers like r600g might. However, r600g >>>> compilation is so much faster than radeonsi, so it might not be that >>>> important. Also given the rate of r600g development, I don't see a >>>> native shader cache happening there. What other drivers are there >>>> really? We have freedreno and vc4 as the other two actively developed >>>> drivers. So you can ask Rob or Eric if they need a shader cache inside >>>> the driver, and if not, well, there is no point in putting the cache >>>> instance into drivers. >>> >>> >>> I asked Eric about this a while ago and while he hadn't thought about it >>> too >>> much he was thinking he would cache NIR. This is an interesting approach >>> and >>> something that I avoided in i965 because caching nir seemed painful. >>> Anyway >>> I'll point them to this thread for their comments. >>> >>>> >>>>> Also its not clear to me where I can store the binary in the st, that >>>>> is >>>>> acceptable and accessible by the driver, any advice on this would be >>>>> appreciated. >>>> >>>> >>>> The st is never accessible from the driver. >>> >>> >>> >>> Sure what I mean is if we load the radeonsi/nouveau binary in the st at >>> link >>> time, how would you pass that down to the driver? >> >> >> I would add an interface for creating a shader from a binary. So we have >> this: >> - pipe_screen::get_compiler_timestamp > > > I think this should return a string (and have a slightly different name) so > we have the freedom to add additional information when required. I'm > thinking GPU type (though that's also in the screen name) and R600_DEBUG > flags that affect compilation.
I talked with Timothy about this on IRC and the current plan is to keep the on-disk cache inside gallium drivers. It might directly interact with our in-memory cache, i.e. all in-memory cache misses will fall back to the on-disk cache, which will load shaders from the disk to the in-memory cache, and the in-memory cache will return a cache hit. All shaders will stay in the in-memory cache after that. Note that the on-disk cache shouldn't keep loaded shaders in memory, because the in-memory cache will do that. > > >> - pipe_context::get_shader_binary >> - pipe_context::create_shader_from_binary > > > Makes sense, but there's an open question: How do we store (optimized) > variants in the cache? I admit they're not as important for the raw loading > time, but it would be nice not to have the compiler threads busy during the > first seconds of a game, either. > > We don't have to solve that all at once, but it would be nice to have an > interface that can be extended easily. Adding optimized variants into the on-disk cache could add a lot of complexity into the existing code but what would the benefit be? If the async compilation is a problem, we can either decrease the number of threads, or decrease the thread priority, or both (IMO). Marek _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev