On Wed, Mar 12, 2014 at 10:36 AM, Ian Romanick <i...@freedesktop.org> wrote: > On 03/12/2014 02:51 AM, Ilia Mirkin wrote: >> On Wed, Mar 12, 2014 at 4:28 AM, Kenneth Graunke <kenn...@whitecape.org> >> wrote: >>> Ideally, we'd like to never even attempt the SIMD16 compile if we could >>> know ahead of time that it won't succeed---it's purely a waste of time. >>> This is especially important for state-based recompiles, which happen at >>> draw time. >>> >>> The fragment shader compiler has a number of checks like: >>> >>> if (dispatch_width == 16) >>> fail("...some reason..."); >>> >>> This patch introduces a new no16() function which replaces the above >>> pattern. In the SIMD8 compile, it sets a "SIMD16 will never work" flag. >>> Then, brw_wm_fs_emit can check that flag, skip the SIMD16 compile, and >>> issue a helpful performance warning if INTEL_DEBUG=perf is set. (In >>> SIMD16 mode, no16() calls fail(), for safety's sake.) >>> >>> The great part is that this is not a heuristic---if the flag is set, we >>> know with 100% certainty that the SIMD16 compile would fail. (It might >>> fail anyway if we run out of registers, but it's always worth trying.) >>> >>> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> >>> --- >>> src/mesa/drivers/dri/i965/brw_fs.cpp | 69 >>> +++++++++++++++++++++++----- >>> src/mesa/drivers/dri/i965/brw_fs.h | 4 ++ >>> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 44 +++++++++--------- >>> 3 files changed, 83 insertions(+), 34 deletions(-) >>> >>> I forgot to send this one out...it applies on top of the previous 7 patches. >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp >>> b/src/mesa/drivers/dri/i965/brw_fs.cpp >>> index 62848be..9ad80c5 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp >>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp >>> @@ -647,9 +647,8 @@ fs_visitor::emit_shader_time_write(enum >>> shader_time_shader_type type, >>> } >>> >>> void >>> -fs_visitor::fail(const char *format, ...) >>> +fs_visitor::vfail(const char *format, va_list va) >>> { >>> - va_list va; >>> char *msg; >>> >>> if (failed) >>> @@ -657,9 +656,7 @@ fs_visitor::fail(const char *format, ...) >>> >>> failed = true; >>> >>> - va_start(va, format); >>> msg = ralloc_vasprintf(mem_ctx, format, va); >>> - va_end(va); >>> msg = ralloc_asprintf(mem_ctx, "FS compile failed: %s\n", msg); >>> >>> this->fail_msg = msg; >>> @@ -669,6 +666,49 @@ fs_visitor::fail(const char *format, ...) >>> } >>> } >>> >>> +void >>> +fs_visitor::fail(const char *format, ...) >>> +{ >>> + va_list va; >>> + >>> + va_start(va, format); >>> + vfail(format, va); >>> + va_end(va); >>> +} >>> + >>> +/** >>> + * Mark this program as impossible to compile in SIMD16 mode. >>> + * >>> + * During the SIMD8 compile (which happens first), we can detect and flag >>> + * things that are unsupported in SIMD16 mode, so the compiler can skip >>> + * the SIMD16 compile altogether. >>> + * >>> + * During a SIMD16 compile (if one happens anyway), this just calls fail(). >>> + */ >>> +void >>> +fs_visitor::no16(const char *format, ...) >>> +{ >>> + va_list va; >>> + >>> + va_start(va, format); >>> + >>> + if (dispatch_width == 16) { >>> + vfail(format, va); >> >> I think there's a va_end() missing in this path. Not sure what the >> end-effect of that is, but I'm pretty sure that the recommendation is >> to always end the list before returning. > > I don't know about that. On some implementations, va_start and va_end > are macros that introduce a new block. You can imagine an > implementation like: > > typedef intptr_t va_list; > > #define va_start(va, last) \ > do { \ > void *__va_temp; \ > va = ((intptr_t)(void *)&last) + sizeof(last) > > #define va_arg(va, type) \ > (__va_temp = (void *) va, va += sizeof(type), *(type *) __va_temp) > > #define va_end(va) \ > } while (0) > > This exact implementation doesn't work because it doesn't align the > pointers, but the general idea is valid. > > I know that some implementations of varargs.h used to work that way. > ANSI C may require that stdarg.h be more civilized.
My va_start man page says "On some systems, va_end contains a closing '}' matching a '{' in va_start" in the NOTES section about the old varargs.h macros, so I don't think that's the case anymore. In fact, gcc defines them as: #define va_start(v,l) __builtin_va_start(v,l) #define va_end(v) __builtin_va_end(v) _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev