On Tue, Dec 5, 2017 at 3:40 PM, Jordan Justen <jordan.l.jus...@intel.com> wrote:
> On 2017-12-05 14:49:28, Mark Janes wrote:
>> Jordan Justen <jordan.l.jus...@intel.com> writes:
>>
>> > On 2017-12-05 09:13:11, Mark Janes wrote:
>> >> Jordan Justen <jordan.l.jus...@intel.com> writes:
>> >>
>> >> > On 2017-11-08 17:26:47, Timothy Arceri wrote:
>> >> >> Reviewed-by: Timothy Arceri <tarc...@itsqueeze.com>
>> >> >>
>> >> >> Mark may want to consider adding some of the once-a-day type CI
>> >> >> runs for this. For example, running the test suite for two
>> >> >> consecutive runs on the same build, so that the second run uses
>> >> >> the shader cache, and also a second run that uses
>> >> >> MESA_GLSL=cache_fb to force testing of the cache fallback path.
>> >> >
>> >> > Yeah. We discussed this previously, but I don't think it's been
>> >> > implemented yet. My opinion is that it could perhaps be a weekly
>> >> > test.
>> >>
>> >> This automation is implemented now. It found the issues reported in
>> >> https://bugs.freedesktop.org/show_bug.cgi?id=103988
>> >>
>> >> My opinion is that this test strategy is inadequate.
>> >
>> > Meaning you think we cannot enable the i965 shader cache for the
>> > 18.0 release?
>>
>> I haven't heard anyone express confidence that this feature is
>> bug-free, and I don't know on what basis that claim could be made. I
>> appreciate that a lot of manual testing has been done. Do you feel
>> confident that the cache can be enabled for all Mesa customers
>> without disruption?
>
> If we had enabled it two months ago, then yes, we would have worked
> through the niche issues, or discovered the so-far-hidden major land
> mine. At this point, it's getting risky. In a month, I will say no.
>
>> > We are already ~1.5 months away from the next stable branch point.
>> > If we want to enable this in 18.0, we should be using this time to
>> > see if enabling the cache by default has some large unexpected side
>> > effect in our graphics stack, or breaks real-world apps.
>>
>> I agree. We should encourage as many users as possible to enable the
>> shader cache in their environments. At least one stable release
>> should be out with a working cache, where the feature can be enabled
>> by those who are eager to benefit from it. I assume there will be a
>> lot of them, and they could flush out issues for everyone who has no
>> idea what a shader cache is.
>>
>> >> Adding a dimension to the test matrix has a high cost, especially
>> >> when combined with the other dimensions of the test matrix (does
>> >> the shader cache need to be tested for 32/64-bit builds? For
>> >> hswgt1/hswgt2/hswgt3e?).
>> >
>> > Are you saying this is too high a cost to run per check-in? Do you
>> > need to disable it for the health of CI? I think I proposed that
>> > daily, or perhaps even weekly, would be adequate.
>>
>> Certainly, the test time per line of shader cache code is massively
>> higher than for any other feature, even if you run it just once a
>> month. Other features have tests that run in milliseconds, not
>> 30 min * 20 machines.
>
> The scope of this feature is nearly the entire API. It is justified
> to throw the entire GL suite of tests at it on a regular basis. The
> cost of running this once per week ought to be reasonable.
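For concreteness, my understanding is that the job being discussed
amounts to roughly the following (the profile name, paths, and cache
directory here are placeholders, not our actual CI scripts, and it
assumes a build with the i965 cache switched on):

   # First run on a fresh build: compiles everything and populates
   # the on-disk shader cache.
   export MESA_GLSL_CACHE_DIR=/tmp/shader-cache-test
   piglit run quick results/cold

   # Second run on the same build: shaders should now come out of
   # the cache instead of being compiled.
   piglit run quick results/warm

   # Third run: MESA_GLSL=cache_fb forces the cache fallback path.
   MESA_GLSL=cache_fb piglit run quick results/fallback

That does throw the whole GL test surface at the cache.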
But the entire API boils down to a comparatively small set of
non-orthogonal state. The configuration of those knobs seems to me
like the place where things are most likely to break. I think there's
value in testing that we're hitting the cache, but if we're not, it's
not a functional regression. I'm more concerned about ensuring we
don't have bugs that affect functionality and cause things to break.

The program key for fragment shaders looks like:

   /** The program key for Fragment/Pixel Shaders. */
   struct brw_wm_prog_key {
      /* Some collection of BRW_WM_IZ_* */
      uint8_t iz_lookup;
      bool stats_wm:1;
      bool flat_shade:1;
      unsigned nr_color_regions:5;
      bool replicate_alpha:1;
      bool clamp_fragment_color:1;
      bool persample_interp:1;
      bool multisample_fbo:1;
      bool frag_coord_adds_sample_pos:1;
      enum brw_wm_aa_enable line_aa:2;
      bool high_quality_derivatives:1;
      bool force_dual_color_blend:1;
      bool coherent_fb_fetch:1;

      uint64_t input_slots_valid;
      unsigned program_string_id;
      GLenum alpha_test_func;   /**< For Gen4/5 MRT alpha test */
      float alpha_test_ref;

      struct brw_sampler_prog_key_data tex;
   };

and it's the most complex of the shader stages. Wouldn't you feel a
lot safer if we just had a piglit test for each of those knobs, one
that compiled a shader, changed some non-orthogonal state, and then
rendered with it, thus confirming that it didn't get the wrong shader
program out of the cache? (There's a rough sketch of what I mean at
the end of this mail.)

I've run into numerous cases where piglit doesn't actually test
something, or where only one seemingly unrelated test out of all
50,000 tickles a real bug in my code. I feel uncomfortable assuming
that piglit's existing coverage is good enough.

In general, I'm still bothered by the idea of just running piglit
twice. That's a very untargeted approach that doesn't fit well with
our existing testing model, as Mark has described.
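To make the per-knob idea concrete, here is a rough, untested sketch
of such a test for the flat_shade bit, written against the usual
piglit utility framework (the helper names are from piglit-util-gl
and would need double-checking; window size and probe tolerance are
whatever piglit defaults to):

   /*
    * Sketch: shader cache test for brw_wm_prog_key.flat_shade.
    *
    * Draw a triangle with smooth shading, then flip to flat shading
    * and draw again.  glShadeModel() changes the fragment program
    * key, so the driver must use a different shader variant for the
    * second draw; if the cache hands back the smooth-shaded variant
    * instead, the second probe fails.
    */
   #include "piglit-util-gl.h"

   PIGLIT_GL_TEST_CONFIG_BEGIN
	config.supports_gl_compat_version = 10;
	config.window_visual = PIGLIT_GL_VISUAL_RGBA |
			       PIGLIT_GL_VISUAL_DOUBLE;
   PIGLIT_GL_TEST_CONFIG_END

   static const char *vs_source =
	"void main() {\n"
	"   gl_Position = gl_Vertex;\n"
	"   gl_FrontColor = gl_Color;\n"
	"}\n";

   static const char *fs_source =
	"void main() {\n"
	"   gl_FragColor = gl_Color;\n"
	"}\n";

   /* Triangle covering the window: two red vertices, then a blue one
    * last, so flat shading (which takes the provoking, i.e. last,
    * vertex's color) turns the whole thing blue. */
   static void
   draw_tri(void)
   {
	glBegin(GL_TRIANGLES);
	glColor3f(1, 0, 0); glVertex2f(-1, -1);
	glColor3f(1, 0, 0); glVertex2f( 4, -1);
	glColor3f(0, 0, 1); glVertex2f(-1,  4);
	glEnd();
   }

   enum piglit_result
   piglit_display(void)
   {
	static const float red[]  = {1, 0, 0, 1};
	static const float blue[] = {0, 0, 1, 1};
	bool pass = true;
	GLuint prog = piglit_build_simple_program(vs_source, fs_source);

	glUseProgram(prog);

	/* Smooth shading: key.flat_shade = false.  The pixel by the
	 * bottom-left corner is (essentially) pure red. */
	glShadeModel(GL_SMOOTH);
	glClear(GL_COLOR_BUFFER_BIT);
	draw_tri();
	pass = piglit_probe_pixel_rgba(1, 1, red) && pass;

	/* Flat shading: same program object, different
	 * brw_wm_prog_key, so a different compiled variant. */
	glShadeModel(GL_FLAT);
	glClear(GL_COLOR_BUFFER_BIT);
	draw_tri();
	pass = piglit_probe_pixel_rgba(1, 1, blue) && pass;

	piglit_present_results();

	return pass ? PIGLIT_PASS : PIGLIT_FAIL;
   }

   void
   piglit_init(int argc, char **argv)
   {
	/* No one-time setup needed. */
   }

The same shape -- compile, draw, flip one piece of key state, draw
again, probe -- would repeat for each bit of brw_wm_prog_key: a
couple dozen small targeted tests rather than a full second piglit
run.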