On 6 November 2013 11:24, Erik Faye-Lund <kusmab...@gmail.com> wrote:
> On Wed, Nov 6, 2013 at 7:49 PM, Paul Berry <stereotype...@gmail.com> wrote:
>> On 6 November 2013 09:35, Erik Faye-Lund <kusmab...@gmail.com> wrote:
>>>
>>> Sorry for the late reply.
>>>
>>> On Fri, Nov 1, 2013 at 4:14 PM, Tapani <tapani.pa...@intel.com> wrote:
>>>> On 11/01/2013 03:31 PM, Erik Faye-Lund wrote:
>>>>>>>>>
>>>>>>>>> Won't using the git-sha1 as a compatibility-criteria cause issues
>>>>>>>>> for developers with local changes? I'm not so worried about this
>>>>>>>>> for OES_get_program_binary itself, but once the shader-cache is in
>>>>>>>>> place it sounds like a potential source of difficult to track down
>>>>>>>>> misbehavior...
>>>>>>>>
>>>>>>>> I agree it might be too aggressive a criterion, but it is hard to
>>>>>>>> come up with one that is better and as simple.
>>>>>>>
>>>>>>> That's not my objection. My objection is that this might give
>>>>>>> headaches for people with local modifications to the glsl-compiler.
>>>>>>> Local modifications do not affect the git-sha1.
>>>>>>
>>>>>> For the automatic shader cache this headache could be helped a bit
>>>>>> with an environment variable or drirc setting that can be used during
>>>>>> development. On the other hand an automatic cache must work in a
>>>>>> transparent way, so it should always be able to recover when it
>>>>>> fails; one should only see it as 'slower than usual' (since a
>>>>>> recompilation/relink is required) sort of behaviour. The WIP of the
>>>>>> automatic cache I sent some time earlier also marked (renamed) these
>>>>>> 'problematic' cached shaders so that they can be detected on further
>>>>>> runs and the cache can ignore them.
>>>>>>
>>>>>> I agree that it might become problematic; on the other hand it is
>>>>>> also easy to just wipe ~/.cache/mesa and disable the cache.
>>>>>
>>>>> That won't help for programs that maintain their own (explicit)
>>>>> shader-cache, which was the intention of introducing binary shaders to
>>>>> OpenGL ES in the first place.
>>>>
>>>> Ok, we are talking about the extension; I thought you referred to the
>>>> automatic caching. For the extension to work, we need at least more
>>>> Piglit tests to ensure that it does not break.
>>>
>>> I was actually talking about both. But for the caching, it's probably
>>> more forgivable, as developers probably know they changed the compiler
>>> and can step around it by flushing the cache. Especially if the build
>>> time gets included, like Paul suggested.
>>>
>>>> Of course every time you go and touch the code, some functionality may
>>>> break, be it this extension or something else.
>>>
>>> That's completely irrelevant here. The rejection mechanism isn't
>>> intended to catch bugs, but to detect intentional format changes. So
>>> let's not confuse the issue.
>>>
>>>> I'm not sure if Chromium, Qt or other users expect the glProgramBinary
>>>> call to always succeed; hopefully not.
>>>
>>> If they do, they're buggy. But there is a chance of that, yes. What I'm
>>> actually talking about is that they might get *told* that everything
>>> went well, and still get a broken shader. Or even a crash. Of course,
>>> this only applies when mesa is built with local modifications, but that
>>> happens a *lot* while debugging application issues.
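(As an aside, to make the criterion under discussion concrete: a rough
sketch of what "git-sha1 plus build time" could look like. MESA_GIT_SHA1
is assumed to come from Mesa's generated git_sha1.h; everything else is
invented for illustration. The point is that the sha1 alone stays the
same when someone rebuilds with local, uncommitted changes, while the
build timestamp does not.)

#include <stdio.h>

#ifndef MESA_GIT_SHA1
#define MESA_GIT_SHA1 ""   /* empty when building from a release tarball */
#endif

/* Illustrative only: tag program binaries with the git sha1 plus the
 * build time.  MESA_GIT_SHA1 by itself misses local modifications;
 * __DATE__/__TIME__ at least change on every rebuild of this file.
 */
void
get_binary_compat_tag(char *buf, size_t size)
{
   snprintf(buf, size, "%s %s %s", MESA_GIT_SHA1, __DATE__, __TIME__);
}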
>>> Perhaps bugs start to disappear, because applications take another,
>>> non-buggy code-path? It'll probably only affect developers, and not
>>> happen in the wild. But I still don't think that's a good excuse.
>>>
>>> However, by a strict reading of the spec, I don't even think we're
>>> allowed to break the shaders for just any reason. The wording of the
>>> spec is "An implementation may reject a program binary if it determines
>>> the program binary was produced by an incompatible or outdated version
>>> of the compiler". The way I read that, changes that don't modify the
>>> compiler aren't really allowed to reject previous shaders. While
>>> diverging from the spec on this *might* not have many real-world
>>> implications, at the very best your solution goes against the intention
>>> of this rejection-mechanism.
>>
>> We have to regard any bug fix, feature improvement, or performance
>> optimization that's applied to the parser, ast, optimization passes, or
>> lowering passes as constituting a change to the compiler.
>
> Not quite. Bug-fixes are fine; if the compiled shader was buggy, you'll
> have a buggy binary shader, and it should be loaded with those bugs
> intact. Feature improvements also do not matter: surely, a pre-compiled
> shader won't use newer features. The same goes for performance
> optimizations; a program that pre-compiled a shader with an old compiler
> surely won't expect to get a performance boost from new optimization
> passes. If so, it will use source shaders instead, and pay the price of
> re-compiling. This is *exactly* what binary shaders are for.
>
>> Otherwise when the user upgrades Mesa, any apps that use
>> get_program_binary will fail to get the advantage of the upgraded code.
>
> Yes, as I said, that's a part of the contract.
>
>> As a practical matter, nearly every Mesa release makes changes to one of
>> those components, even point releases. So allowing old program binaries
>> in the event that the compiler hasn't changed really wouldn't produce
>> any end user benefit.
>
> Even if *some* modification happened to the compiler, that's not a good
> reason to reject the binary shader IMO. Only *relevant* changes. And
> it's not untypical for distro-maintainers to cherry-pick patches from
> time to time. And then there are bleeding-edge users.
>
>> And it would risk big developer headaches if we ever make a change that
>> affects compilation without realizing it. IMHO the risks outweigh the
>> benefits.
>
> As someone who has gone down this path before: it's not as bad as you
> make it out to be. If you attack the problem from the right angle, that
> is.
>
>> And as far as whether we're strictly conforming to the spec, the spec is
>> vague here since it doesn't define what constitutes the "version of the
>> compiler" vs the "version of the driver". I think it's an entirely
>> defensible interpretation to consider the compiler version and the
>> driver version to be the same thing; so whenever the driver is upgraded,
>> the client application needs to be prepared for the possibility that the
>> shader binary might be rejected.
>
> Now you're making terminology up. The spec doesn't talk about the driver
> version at all. And it's not unclear at all from the OpenGL spec what
> the task of the compiler is, so no, I don't agree that it's a defensible
> position in that regard. I'm not saying it isn't defensible in *any*
> regard, but that's not it.
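(For what it's worth, the rejection path the spec describes only works if
applications already treat glProgramBinary as something that can fail.
Roughly the following pattern; the helper names are made up and error
handling is trimmed. load_cached_binary() and compile_from_source() stand
in for whatever cache layer the application has.)

#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>       /* GL_OES_get_program_binary */

/* Placeholders for the application's own cache layer. */
extern void *load_cached_binary(GLenum *format, GLint *length);
extern void compile_from_source(GLuint prog, const char *vs, const char *fs);

GLuint
load_program(const char *vs_src, const char *fs_src)
{
   GLuint prog = glCreateProgram();
   GLenum format = 0;
   GLint length = 0;
   void *blob = load_cached_binary(&format, &length);

   if (blob) {
      /* In real code the OES entry point comes from eglGetProcAddress(). */
      glProgramBinaryOES(prog, format, blob, length);

      GLint ok = GL_FALSE;
      glGetProgramiv(prog, GL_LINK_STATUS, &ok);
      if (ok)
         return prog;            /* the driver accepted the binary */
   }

   /* No cache entry, or the driver rejected it: compile from source. */
   compile_from_source(prog, vs_src, fs_src);
   return prog;
}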
>>> Storing halfway-compiled code rather than fully-compiled code seems
>>> quite sketchy to me, as it introduces the need for compatibility at a
>>> point where flexibility is needed. The result is a program-binary that
>>> 1) must still perform register allocation and scheduling when loading,
>>> so the time-saving is limited, and 2) fails to reload even for trivial
>>> bugfixes in unrelated code. To me, it sounds like it would be better to
>>> simply not expose the extension at all than to ship this middle-ground.
>>> I'd hate to see applications having to do vendor detection for mesa to
>>> disable caching that turned out to be more of a burden than a gain.
>>
>> A key piece of discussion that I feel like we're missing is the fact
>> that on i965 at least, the shader sometimes needs to be recompiled as a
>> result of changes to GL state. The get_program_binary spec requires
>> these state-based recompiles to work properly, even if the GL state
>> machine is placed in a state that hadn't been seen at the time the
>> program binary was saved.
>
> What kind of state are we talking about? I thought all GPU vendors had
> learned from NV's GeForce FX fiasco to stay away from non-trivially
> patchable state-dependent shaders...

The GL state that triggers recompiles on i965 can be found in the
following structs:

- brw_wm_prog_key (in src/mesa/drivers/dri/i965/brw_wm.h) for fragment
  shaders
- brw_vs_prog_key (in src/mesa/drivers/dri/i965/brw_vs.h) for vertex
  shaders
- brw_gs_prog_key (in src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h)
  for geometry shaders

>> Therefore, the program binary format needs to include the IR, since
>> that's what we use as a starting point when we do state-based
>> recompiles.
>
> I think that's more an indication that i965 is a bad fit for such an
> extension, rather than that the IR must be included. In fact, when we
> defined the extension, discouraging state-dependent shaders was
> explicitly one of the goals.
>
> Fundamentally redefining what the extension does for *all* mesa-drivers
> sounds to me more like intel policy than what's best for everybody. If
> the binary format were a driver-specific blob container like I
> suggested, intel could *still* serialize the IR into it. But everyone
> else wouldn't have to suffer from it.
>
>> Given that the IR format is defined in core mesa, it makes sense that
>> the serialization code for it should go in core mesa as well.
>>
>> I agree with you that in order to implement the full intent of the
>> extension, this mechanism needs to also store the fully-compiled machine
>> code for the shader. I brought this up in my comments on patch 5/6, and
>> it sounds like Tapani is on board with doing that, but he has some more
>> research to do before he can implement that part of the feature.
>>
>> If in the future we have a back end that doesn't need to do state-based
>> recompiles (or that can do state-based recompiles without needing access
>> to the IR), then we can always add a mechanism that allows the back end
>> to opt out of storing the IR in the shader binary. But since Tapani is
>> targeting i965, which needs the IR, I think it's reasonable to get that
>> part of things working first.
>
> I guess I don't understand why these need to be two different things,
> rather than an intel-specific implementation of some generic mechanism.
>
> Anyway, Tapani asked me to elaborate why I thought it was a bad idea in
> the first place, and I did. Hopefully, I was clear enough.
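(For reference, the "driver-specific blob container" mentioned above,
sketched concretely. The layout below is invented purely for illustration
and is not anyone's actual proposal: core code would define only the
envelope and treat the payload as opaque, so i965 could keep serializing
its IR into it while other drivers store finished machine code.)

#include <stdint.h>

struct program_binary_envelope {
   uint32_t magic;            /* identifies a mesa program binary         */
   uint32_t envelope_version; /* lets the envelope itself evolve          */
   uint32_t driver_id;        /* which driver wrote the payload           */
   char     driver_tag[64];   /* driver's own compatibility string, e.g.  */
                              /* git sha1 plus build time                 */
   uint32_t payload_size;     /* bytes of driver-private data that follow */
   /* payload_size bytes of opaque, driver-owned data follow the header */
};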