[Mesa-dev] [PATCH 0/3 v3] clLinkProgram
This serie add OpenCL 1.2 clLinkProgram. However, it lacks the binary type part that is mandatory for input validation and also for CL_PROGRAM_BINARY_TYPE query. This will be adressed in another serie once we agree on the way to store it. EdB (3): clover: separate compile and link stages clover: override ret_object clover: add clLinkProgramm for CL 1.2 src/gallium/state_trackers/clover/api/dispatch.cpp | 2 +- src/gallium/state_trackers/clover/api/program.cpp | 37 ++- src/gallium/state_trackers/clover/api/util.hpp | 12 + .../state_trackers/clover/core/compiler.hpp| 7 +- src/gallium/state_trackers/clover/core/error.hpp | 21 ++ src/gallium/state_trackers/clover/core/program.cpp | 97 ++- src/gallium/state_trackers/clover/core/program.hpp | 11 +- .../state_trackers/clover/llvm/invocation.cpp | 281 +++-- 8 files changed, 372 insertions(+), 96 deletions(-) -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] clover: override ret_object
Return an API object from an intrusive smart reference Clover object, incrementing the reference count of the object. --- src/gallium/state_trackers/clover/api/util.hpp | 12 1 file changed, 12 insertions(+) diff --git a/src/gallium/state_trackers/clover/api/util.hpp b/src/gallium/state_trackers/clover/api/util.hpp index 918df61..6af28f2 100644 --- a/src/gallium/state_trackers/clover/api/util.hpp +++ b/src/gallium/state_trackers/clover/api/util.hpp @@ -61,6 +61,18 @@ namespace clover { *p = desc(v()); } } + + /// + /// Return an API object from an intrusive smart reference Clover object, + /// incrementing the reference count of the object. + /// + template + typename T::descriptor_type * + ret_object(const intrusive_ref &v) { + v().retain(); + return desc(v()); + } + } #endif -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/3 v3] clLinkProgram
(resending because git send-email crashed) This serie add OpenCL 1.2 clLinkProgram. However, it lacks the binary type part that is mandatory for input validation and also for CL_PROGRAM_BINARY_TYPE query. This will be adressed in another serie once we agree on the way to store it. EdB (3): clover: separate compile and link stages clover: override ret_object clover: add clLinkProgramm for CL 1.2 src/gallium/state_trackers/clover/api/dispatch.cpp | 2 +- src/gallium/state_trackers/clover/api/program.cpp | 37 ++- src/gallium/state_trackers/clover/api/util.hpp | 12 + .../state_trackers/clover/core/compiler.hpp| 7 +- src/gallium/state_trackers/clover/core/error.hpp | 21 ++ src/gallium/state_trackers/clover/core/program.cpp | 97 ++- src/gallium/state_trackers/clover/core/program.hpp | 11 +- .../state_trackers/clover/llvm/invocation.cpp | 281 +++-- 8 files changed, 372 insertions(+), 96 deletions(-) -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] clover: separate compile and link stages
--- src/gallium/state_trackers/clover/api/program.cpp | 6 +- .../state_trackers/clover/core/compiler.hpp| 7 +- src/gallium/state_trackers/clover/core/error.hpp | 21 ++ src/gallium/state_trackers/clover/core/program.cpp | 93 ++- src/gallium/state_trackers/clover/core/program.hpp | 10 +- .../state_trackers/clover/llvm/invocation.cpp | 281 +++-- 6 files changed, 323 insertions(+), 95 deletions(-) diff --git a/src/gallium/state_trackers/clover/api/program.cpp b/src/gallium/state_trackers/clover/api/program.cpp index e9b1f38..2441d81 100644 --- a/src/gallium/state_trackers/clover/api/program.cpp +++ b/src/gallium/state_trackers/clover/api/program.cpp @@ -184,10 +184,6 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs, prog.build(devs, opts); return CL_SUCCESS; } catch (error &e) { - if (e.get() == CL_INVALID_COMPILER_OPTIONS) - return CL_INVALID_BUILD_OPTIONS; - if (e.get() == CL_COMPILE_PROGRAM_FAILURE) - return CL_BUILD_PROGRAM_FAILURE; return e.get(); } @@ -224,7 +220,7 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs, range(header_names, num_headers), objs(d_header_progs, num_headers)); - prog.build(devs, opts, headers); + prog.compile(devs, opts, headers); return CL_SUCCESS; } catch (error &e) { diff --git a/src/gallium/state_trackers/clover/core/compiler.hpp b/src/gallium/state_trackers/clover/core/compiler.hpp index c68aa39..31fb6ee 100644 --- a/src/gallium/state_trackers/clover/core/compiler.hpp +++ b/src/gallium/state_trackers/clover/core/compiler.hpp @@ -32,11 +32,16 @@ namespace clover { module compile_program_llvm(const std::string &source, const header_map &headers, - pipe_shader_ir ir, const std::string &target, const std::string &opts, std::string &r_log); + module link_program_llvm(const std::vector &modules, +enum pipe_shader_ir ir, +const std::string &target, +const std::string &opts, +std::string &r_log); + module compile_program_tgsi(const std::string &source); } diff --git a/src/gallium/state_trackers/clover/core/error.hpp b/src/gallium/state_trackers/clover/core/error.hpp index 780b973..3c1bf90 100644 --- a/src/gallium/state_trackers/clover/core/error.hpp +++ b/src/gallium/state_trackers/clover/core/error.hpp @@ -68,10 +68,31 @@ namespace clover { class build_error : public error { public: build_error(const std::string &what = "") : + error(CL_BUILD_PROGRAM_FAILURE, what) { + } + }; + + class compile_error : public error { + public: + compile_error(const std::string &what = "") : error(CL_COMPILE_PROGRAM_FAILURE, what) { } }; + class link_error : public error { + public: + link_error(const std::string &what = "") : + error(CL_LINK_PROGRAM_FAILURE, what) { + } + }; + + class link_option_error : public error { + public: + link_option_error(const std::string &what = "") : + error(CL_INVALID_LINKER_OPTIONS , what) { + } + }; + template class invalid_object_error; diff --git a/src/gallium/state_trackers/clover/core/program.cpp b/src/gallium/state_trackers/clover/core/program.cpp index 0d6cc40..21faf4e 100644 --- a/src/gallium/state_trackers/clover/core/program.cpp +++ b/src/gallium/state_trackers/clover/core/program.cpp @@ -40,15 +40,37 @@ program::program(clover::context &ctx, } void -program::build(const ref_vector &devs, const char *opts, - const header_map &headers) { +program::build(const ref_vector &devs, const char *opts) { + + if (has_source) { + try { + compile(devs, opts, {}); + if (!link(devs, opts, {*this}, true)) +throw error(CL_BUILD_PROGRAM_FAILURE); + } catch (error &e) { + switch (e.get()) { +case CL_INVALID_COMPILER_OPTIONS: +case CL_INVALID_LINKER_OPTIONS: + e = error(CL_INVALID_BUILD_OPTIONS); + break; +case CL_COMPILE_PROGRAM_FAILURE: +case CL_LINK_PROGRAM_FAILURE: + e = error(CL_BUILD_PROGRAM_FAILURE); + break; + } + throw; + } + } +} + +void +program::compile(const ref_vector &devs, const char *opts, + const header_map &headers) { if (has_source) { _devices = devs; for (auto &dev : devs) { - _binaries.erase(&dev); - _logs.erase(&dev); - _opts.erase(&dev); + clean(&dev); _opts.insert({ &dev, opts }); @@ -58,12 +80,11 @@ program::build(const ref_vector &devs, const char *opts, auto module = (dev.ir_format() == PIPE_SHADER_IR_TGSI ? compile_pro
[Mesa-dev] [PATCH 3/3] clover: add clLinkProgramm for CL 1.2
--- src/gallium/state_trackers/clover/api/dispatch.cpp | 2 +- src/gallium/state_trackers/clover/api/program.cpp | 31 ++ src/gallium/state_trackers/clover/core/program.cpp | 4 +++ src/gallium/state_trackers/clover/core/program.hpp | 1 + 4 files changed, 37 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/clover/api/dispatch.cpp b/src/gallium/state_trackers/clover/api/dispatch.cpp index b5a4094..44bff4f 100644 --- a/src/gallium/state_trackers/clover/api/dispatch.cpp +++ b/src/gallium/state_trackers/clover/api/dispatch.cpp @@ -123,7 +123,7 @@ namespace clover { clCreateImage, clCreateProgramWithBuiltInKernels, clCompileProgram, - NULL, // clLinkProgram + clLinkProgram, clUnloadPlatformCompiler, NULL, // clGetKernelArgInfo NULL, // clEnqueueFillBuffer diff --git a/src/gallium/state_trackers/clover/api/program.cpp b/src/gallium/state_trackers/clover/api/program.cpp index 2441d81..d532b66 100644 --- a/src/gallium/state_trackers/clover/api/program.cpp +++ b/src/gallium/state_trackers/clover/api/program.cpp @@ -227,6 +227,37 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs, return e.get(); } +CLOVER_API cl_program +clLinkProgram (cl_context d_ctx, cl_uint num_devs, const cl_device_id *d_devs, + const char *p_opts, cl_uint num_progs, const cl_program *d_progs, + void (*pfn_notify) (cl_program, void *), void *user_data, + cl_int *r_errcode) try { + auto &ctx = obj(d_ctx); + auto devs = (d_devs ? objs(d_devs, num_devs) : +ref_vector(ctx.devices())); + auto opts = (p_opts ? p_opts : ""); + auto progs = objs(d_progs, num_progs); + + if ((!pfn_notify && user_data)) + throw error(CL_INVALID_VALUE); + + if (any_of([&](const device &dev) { +return !count(dev, ctx.devices()); + }, objs(d_devs, num_devs))) + throw error(CL_INVALID_DEVICE); + + auto prog = create(ctx); + if (prog().link(devs, opts, progs)) + *r_errcode = CL_SUCCESS; + else + *r_errcode = CL_LINK_PROGRAM_FAILURE; + + return ret_object(prog); +} catch (error &e) { + ret_error(r_errcode, e); + return NULL; +} + CLOVER_API cl_int clUnloadCompiler() { return CL_SUCCESS; diff --git a/src/gallium/state_trackers/clover/core/program.cpp b/src/gallium/state_trackers/clover/core/program.cpp index 21faf4e..825537f 100644 --- a/src/gallium/state_trackers/clover/core/program.cpp +++ b/src/gallium/state_trackers/clover/core/program.cpp @@ -24,6 +24,10 @@ using namespace clover; +program::program(clover::context &ctx) : + has_source(false), context(ctx), _kernel_ref_counter(0) { +} + program::program(clover::context &ctx, const std::string &source) : has_source(true), context(ctx), _source(source), _kernel_ref_counter(0) { } diff --git a/src/gallium/state_trackers/clover/core/program.hpp b/src/gallium/state_trackers/clover/core/program.hpp index 314979b..a1036b7 100644 --- a/src/gallium/state_trackers/clover/core/program.hpp +++ b/src/gallium/state_trackers/clover/core/program.hpp @@ -37,6 +37,7 @@ namespace clover { evals, const std::vector> &> device_range; public: + program(clover::context &ctx); program(clover::context &ctx, const std::string &source); program(clover::context &ctx, -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V2] mesa: fix active sampler conflict validation
The type stored in gl_uniform_storage is the type of a single array element not the array type so size was always 1. Use the number of array elements stored in the gl_uniform_storage instead. V2: Dont validate sampler units pointing to 0 --- Fixes new piglit test: http://lists.freedesktop.org/archives/piglit/2015-July/016452.html src/mesa/main/uniform_query.cpp | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index 680ba16..036530e 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -1026,18 +1026,23 @@ _mesa_sampler_uniforms_pipeline_are_valid(struct gl_pipeline_object *pipeline) for (unsigned i = 0; i < shProg[idx]->NumUniformStorage; i++) { const struct gl_uniform_storage *const storage = &shProg[idx]->UniformStorage[i]; - const glsl_type *const t = (storage->type->is_array()) -? storage->type->fields.array : storage->type; - if (!t->is_sampler()) + if (!storage->type->is_sampler()) continue; active_samplers++; - const unsigned count = MAX2(1, storage->type->array_size()); + const unsigned count = MAX2(1, storage->array_elements); for (unsigned j = 0; j < count; j++) { const unsigned unit = storage->storage[j].i; +/* FIXME: Samplers are initialized to 0 and Mesa doesn't do a + * great job of eliminating unused uniforms currently so for now + * don't throw an error if two sampler types both point to 0. + */ +if (unit == 0) + continue; + /* The types of the samplers associated with a particular texture * unit must be an exact match. Page 74 (page 89 of the PDF) of * the OpenGL 3.3 core spec says: @@ -1047,13 +1052,14 @@ _mesa_sampler_uniforms_pipeline_are_valid(struct gl_pipeline_object *pipeline) * program object." */ if (unit_types[unit] == NULL) { - unit_types[unit] = t; -} else if (unit_types[unit] != t) { + unit_types[unit] = storage->type; +} else if (unit_types[unit] != storage->type) { pipeline->InfoLog = ralloc_asprintf(pipeline, "Texture unit %d is accessed both as %s " "and %s", - unit, unit_types[unit]->name, t->name); + unit, unit_types[unit]->name, + storage->type->name); return false; } } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC] gallium: add interface for writable shader images
From: Marek Olšák Other approaches are being considered: 1) Don't use resource wrappers (views) and pass all view parameters (format, layer range, level) to set_shader_images just like set_vertex_buffers, set_constant_buffer, or even glBindImageTexture do. 2) Use pipe_sampler_view instead of pipe_image_view, and maybe even use set_sampler_views instead of set_shader_images. set_sampler_views would have to use start_slot >= PIPE_MAX_SAMPLERS for all writable images to allow for OpenGL textures in the lower slots. I wouldn't like to go back to using pipe_surface, because it doesn't fit into the DX12 hw design. Please discuss. --- src/gallium/auxiliary/util/u_debug_describe.c | 9 +++ src/gallium/auxiliary/util/u_debug_describe.h | 2 ++ src/gallium/auxiliary/util/u_dump.h | 3 +++ src/gallium/auxiliary/util/u_dump_state.c | 27 + src/gallium/auxiliary/util/u_inlines.h| 10 src/gallium/docs/source/context.rst | 16 ++-- src/gallium/drivers/hangdump/hd_context.c | 3 ++- src/gallium/drivers/ilo/ilo_state.c | 10 +--- src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 10 +--- src/gallium/include/pipe/p_context.h | 35 ++- src/gallium/include/pipe/p_state.h| 25 +++ 11 files changed, 122 insertions(+), 28 deletions(-) diff --git a/src/gallium/auxiliary/util/u_debug_describe.c b/src/gallium/auxiliary/util/u_debug_describe.c index df73ed8..f428d22 100644 --- a/src/gallium/auxiliary/util/u_debug_describe.c +++ b/src/gallium/auxiliary/util/u_debug_describe.c @@ -81,6 +81,15 @@ debug_describe_sampler_view(char* buf, const struct pipe_sampler_view *ptr) } void +debug_describe_image_view(char* buf, const struct pipe_image_view *ptr) +{ + char res[128]; + debug_describe_resource(res, ptr->resource); + util_sprintf(buf, "pipe_image_view<%s,%s>", res, +util_format_short_name(ptr->format)); +} + +void debug_describe_so_target(char* buf, const struct pipe_stream_output_target *ptr) { diff --git a/src/gallium/auxiliary/util/u_debug_describe.h b/src/gallium/auxiliary/util/u_debug_describe.h index 4f7882b..2172ecb 100644 --- a/src/gallium/auxiliary/util/u_debug_describe.h +++ b/src/gallium/auxiliary/util/u_debug_describe.h @@ -35,12 +35,14 @@ struct pipe_reference; struct pipe_resource; struct pipe_surface; struct pipe_sampler_view; +struct pipe_image_view; /* a 256-byte buffer is necessary and sufficient */ void debug_describe_reference(char* buf, const struct pipe_reference*ptr); void debug_describe_resource(char* buf, const struct pipe_resource *ptr); void debug_describe_surface(char* buf, const struct pipe_surface *ptr); void debug_describe_sampler_view(char* buf, const struct pipe_sampler_view *ptr); +void debug_describe_image_view(char* buf, const struct pipe_image_view *ptr); void debug_describe_so_target(char* buf, const struct pipe_stream_output_target *ptr); diff --git a/src/gallium/auxiliary/util/u_dump.h b/src/gallium/auxiliary/util/u_dump.h index 1d279ef..4304613 100644 --- a/src/gallium/auxiliary/util/u_dump.h +++ b/src/gallium/auxiliary/util/u_dump.h @@ -156,6 +156,9 @@ void util_dump_sampler_view(FILE *stream, const struct pipe_sampler_view *state); void +util_dump_image_view(FILE *stream, const struct pipe_image_view *state); + +void util_dump_transfer(FILE *stream, const struct pipe_transfer *state); diff --git a/src/gallium/auxiliary/util/u_dump_state.c b/src/gallium/auxiliary/util/u_dump_state.c index 7784eb9..be0b1b2 100644 --- a/src/gallium/auxiliary/util/u_dump_state.c +++ b/src/gallium/auxiliary/util/u_dump_state.c @@ -708,6 +708,33 @@ util_dump_sampler_view(FILE *stream, const struct pipe_sampler_view *state) void +util_dump_image_view(FILE *stream, const struct pipe_image_view *state) +{ + if (!state) { + util_dump_null(stream); + return; + } + + util_dump_struct_begin(stream, "pipe_image_view"); + + util_dump_member(stream, ptr, state, resource); + util_dump_member(stream, format, state, format); + + if (state->resource->target == PIPE_BUFFER) { + util_dump_member(stream, uint, state, u.buf.first_element); + util_dump_member(stream, uint, state, u.buf.last_element); + } + else { + util_dump_member(stream, uint, state, u.tex.first_layer); + util_dump_member(stream, uint, state, u.tex.last_layer); + util_dump_member(stream, uint, state, u.tex.level); + } + + util_dump_struct_end(stream); +} + + +void util_dump_transfer(FILE *stream, const struct pipe_transfer *state) { if(!state) { diff --git a/src/gallium/auxiliary/util/u_inlines.h b/src/gallium/auxiliary/util/u_inlines.h index 9540162..661a949 100644 --- a/src/gallium/auxiliary/util/u_inlines.h +++ b/src/gallium/auxiliary/util/u_inlines.h @@ -173,6 +173,16 @@ pipe_sampler_view_
Re: [Mesa-dev] [RFC] gallium: add interface for writable shader images
On Sun, Jul 5, 2015 at 9:25 AM, Marek Olšák wrote: > From: Marek Olšák > > Other approaches are being considered: > > 1) Don't use resource wrappers (views) and pass all view parameters >(format, layer range, level) to set_shader_images just like >set_vertex_buffers, set_constant_buffer, or even glBindImageTexture do. > > 2) Use pipe_sampler_view instead of pipe_image_view, >and maybe even use set_sampler_views instead of set_shader_images. >set_sampler_views would have to use start_slot >= PIPE_MAX_SAMPLERS for >all writable images to allow for OpenGL textures in the lower slots. > > I wouldn't like to go back to using pipe_surface, because it doesn't fit > into the DX12 hw design. > > Please discuss. > --- > diff --git a/src/gallium/include/pipe/p_context.h > b/src/gallium/include/pipe/p_context.h > index c2eedf8..13f3938 100644 > --- a/src/gallium/include/pipe/p_context.h > +++ b/src/gallium/include/pipe/p_context.h > @@ -48,6 +48,7 @@ struct pipe_depth_stencil_alpha_state; > struct pipe_draw_info; > struct pipe_fence_handle; > struct pipe_framebuffer_state; > +struct pipe_image_view; > struct pipe_index_buffer; > struct pipe_query; > struct pipe_poly_stipple; > @@ -236,20 +237,21 @@ struct pipe_context { >const float default_inner_level[2]); > > /** > -* Bind an array of shader resources that will be used by the > -* graphics pipeline. Any resources that were previously bound to > -* the specified range will be unbound after this call. > +* Bind an array of images that will be used by a shader. > +* Any imagse that were previously bound to the specified range > +* will be unbound. > * > -* \param start first resource to bind. > -* \param count number of consecutive resources to bind. > -* \param resources array of pointers to the resources to bind, it > +* \param shader shader stage where the images will be bound. > +* \param start_slot first iamage slot to bind. > +* \param count number of consecutive images to bind. > +* \param buffersarray of pointers to the images to bind, it > * should contain at least \a count elements > -* unless it's NULL, in which case no new > -* resources will be bound. > +* unless it's NULL, in which case no images will > +* be bound. > */ > - void (*set_shader_resources)(struct pipe_context *, > -unsigned start, unsigned count, > -struct pipe_surface **resources); > + void (*set_shader_images)(struct pipe_context *, unsigned shader, > + unsigned start_slot, unsigned count, > + struct pipe_image_view *images); On Fermi, there is only one set of 8 images that can be bound, irrespective of stage. There are a different set of 8 bindings for 3d and compute pipelines though. Kepler+ is all bindless so it can do whatever. Even though I originally advocated for the stage argument, it appears to neither match Fermi hardware nor the GL API where I believe images are also in a global namespace. > > void (*set_vertex_buffers)( struct pipe_context *, > unsigned start_slot, > @@ -392,6 +394,17 @@ struct pipe_context { > struct pipe_surface *); > > /** > +* Create an image view into a buffer or texture to be used with load, > +* store, and atomic instructions by a shader stage. > +*/ > + struct pipe_image_view * (*create_image_view)(struct pipe_context *ctx, > + struct pipe_resource > *texture, > + const struct > pipe_image_view *templat); > + > + void (*image_view_destroy)(struct pipe_context *ctx, > + struct pipe_image_view *view); > + > + /** > * Map a resource. > * > * Transfers are (by default) context-private and allow uploads to be > diff --git a/src/gallium/include/pipe/p_state.h > b/src/gallium/include/pipe/p_state.h > index dc8f550..f655dda 100644 > --- a/src/gallium/include/pipe/p_state.h > +++ b/src/gallium/include/pipe/p_state.h > @@ -386,6 +386,31 @@ struct pipe_sampler_view > > > /** > + * A view into a writable buffer or texture that can be bound to a shader > + * stage. > + */ > +struct pipe_image_view > +{ > + struct pipe_reference reference; > + struct pipe_resource *resource; /**< resource into which this is a view > */ > + struct pipe_context *context; /**< context this view belongs to */ > + enum pipe_format format; /**< typed PIPE_FORMAT_x */ > + > + union { > + struct { > + unsigned first_layer:16; /**< first layer to use for array > textures */ > + unsigned last_layer:16; /**< last layer to use for array > text
Re: [Mesa-dev] [PATCH 1/3] clover: separate compile and link stages
Hi EdB, a bunch of comments inline, EdB writes: > --- > src/gallium/state_trackers/clover/api/program.cpp | 6 +- > .../state_trackers/clover/core/compiler.hpp| 7 +- > src/gallium/state_trackers/clover/core/error.hpp | 21 ++ > src/gallium/state_trackers/clover/core/program.cpp | 93 ++- > src/gallium/state_trackers/clover/core/program.hpp | 10 +- > .../state_trackers/clover/llvm/invocation.cpp | 281 > +++-- > 6 files changed, 323 insertions(+), 95 deletions(-) > > diff --git a/src/gallium/state_trackers/clover/api/program.cpp > b/src/gallium/state_trackers/clover/api/program.cpp > index e9b1f38..2441d81 100644 > --- a/src/gallium/state_trackers/clover/api/program.cpp > +++ b/src/gallium/state_trackers/clover/api/program.cpp > @@ -184,10 +184,6 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs, > prog.build(devs, opts); I don't think there's any reason to keep the program::build method around anymore, it's only going to be called from this entry point so you could as well make the two function calls to ::compile() and ::link() directly from here. > return CL_SUCCESS; > } catch (error &e) { > - if (e.get() == CL_INVALID_COMPILER_OPTIONS) > - return CL_INVALID_BUILD_OPTIONS; > - if (e.get() == CL_COMPILE_PROGRAM_FAILURE) > - return CL_BUILD_PROGRAM_FAILURE; > return e.get(); > } > > @@ -224,7 +220,7 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs, >range(header_names, num_headers), >objs(d_header_progs, num_headers)); > > - prog.build(devs, opts, headers); > + prog.compile(devs, opts, headers); > return CL_SUCCESS; > > } catch (error &e) { > diff --git a/src/gallium/state_trackers/clover/core/compiler.hpp > b/src/gallium/state_trackers/clover/core/compiler.hpp > index c68aa39..31fb6ee 100644 > --- a/src/gallium/state_trackers/clover/core/compiler.hpp > +++ b/src/gallium/state_trackers/clover/core/compiler.hpp > @@ -32,11 +32,16 @@ namespace clover { > > module compile_program_llvm(const std::string &source, > const header_map &headers, > - pipe_shader_ir ir, > const std::string &target, > const std::string &opts, > std::string &r_log); > > + module link_program_llvm(const std::vector &modules, > +enum pipe_shader_ir ir, > +const std::string &target, > +const std::string &opts, > +std::string &r_log); > + > module compile_program_tgsi(const std::string &source); > } > > diff --git a/src/gallium/state_trackers/clover/core/error.hpp > b/src/gallium/state_trackers/clover/core/error.hpp > index 780b973..3c1bf90 100644 > --- a/src/gallium/state_trackers/clover/core/error.hpp > +++ b/src/gallium/state_trackers/clover/core/error.hpp > @@ -68,10 +68,31 @@ namespace clover { > class build_error : public error { > public: >build_error(const std::string &what = "") : > + error(CL_BUILD_PROGRAM_FAILURE, what) { > + } > + }; > + This exception class now seems redundant -- With program::build() gone build is no longer a thing. > + class compile_error : public error { > + public: > + compile_error(const std::string &what = "") : > error(CL_COMPILE_PROGRAM_FAILURE, what) { >} > }; > > + class link_error : public error { > + public: > + link_error(const std::string &what = "") : > + error(CL_LINK_PROGRAM_FAILURE, what) { > + } > + }; > + > + class link_option_error : public error { > + public: > + link_option_error(const std::string &what = "") : > + error(CL_INVALID_LINKER_OPTIONS , what) { > + } > + }; > + I don't think you really need to special-case link_option_error against the less specific clover::error class? > template > class invalid_object_error; > > diff --git a/src/gallium/state_trackers/clover/core/program.cpp > b/src/gallium/state_trackers/clover/core/program.cpp > index 0d6cc40..21faf4e 100644 > --- a/src/gallium/state_trackers/clover/core/program.cpp > +++ b/src/gallium/state_trackers/clover/core/program.cpp > @@ -40,15 +40,37 @@ program::program(clover::context &ctx, > } > > void > -program::build(const ref_vector &devs, const char *opts, > - const header_map &headers) { > +program::build(const ref_vector &devs, const char *opts) { > + Unnecessary whitespace. > + if (has_source) { > + try { > + compile(devs, opts, {}); > + if (!link(devs, opts, {*this}, true)) > +throw error(CL_BUILD_PROGRAM_FAILURE); > + } catch (error &e) { > + switch (e.get()) { > +case CL_INVALID_COMPILER_OPTIONS: > +case CL_INVALID_LINKER_OPTIONS: > + e = error(CL_INVALID_BUILD_OPTIONS); > +
Re: [Mesa-dev] [PATCH 2/3] clover: override ret_object
EdB writes: > Return an API object from an intrusive smart reference Clover object, > incrementing the reference count of the object. > --- > src/gallium/state_trackers/clover/api/util.hpp | 12 > 1 file changed, 12 insertions(+) > > diff --git a/src/gallium/state_trackers/clover/api/util.hpp > b/src/gallium/state_trackers/clover/api/util.hpp > index 918df61..6af28f2 100644 > --- a/src/gallium/state_trackers/clover/api/util.hpp > +++ b/src/gallium/state_trackers/clover/api/util.hpp > @@ -61,6 +61,18 @@ namespace clover { > *p = desc(v()); >} > } > + > + /// > + /// Return an API object from an intrusive smart reference Clover object, > + /// incrementing the reference count of the object. > + /// The function below looks OK, but the explanation doesn't make much sense to me. How about "[...] from an intrusive reference to a Clover object [...]"? With that fixed: Reviewed-by: Francisco Jerez > + template > + typename T::descriptor_type * > + ret_object(const intrusive_ref &v) { > + v().retain(); > + return desc(v()); > + } > + > } > > #endif > -- > 2.4.3 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] clover: separate compile and link stages
On Sunday 05 July 2015 18:15:33 Francisco Jerez wrote: > Hi EdB, a bunch of comments inline, Hello > > EdB writes: > > --- > > > > src/gallium/state_trackers/clover/api/program.cpp | 6 +- > > .../state_trackers/clover/core/compiler.hpp| 7 +- > > src/gallium/state_trackers/clover/core/error.hpp | 21 ++ > > src/gallium/state_trackers/clover/core/program.cpp | 93 ++- > > src/gallium/state_trackers/clover/core/program.hpp | 10 +- > > .../state_trackers/clover/llvm/invocation.cpp | 281 > > +++-- 6 files changed, 323 insertions(+), 95 > > deletions(-) > > > > diff --git a/src/gallium/state_trackers/clover/api/program.cpp > > b/src/gallium/state_trackers/clover/api/program.cpp index > > e9b1f38..2441d81 100644 > > --- a/src/gallium/state_trackers/clover/api/program.cpp > > +++ b/src/gallium/state_trackers/clover/api/program.cpp > > @@ -184,10 +184,6 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs, > > > > prog.build(devs, opts); > > I don't think there's any reason to keep the program::build method > around anymore, it's only going to be called from this entry point so > you could as well make the two function calls to ::compile() and > > ::link() directly from here. ok > :: > > return CL_SUCCESS; > > > > } catch (error &e) { > > > > - if (e.get() == CL_INVALID_COMPILER_OPTIONS) > > - return CL_INVALID_BUILD_OPTIONS; > > - if (e.get() == CL_COMPILE_PROGRAM_FAILURE) > > - return CL_BUILD_PROGRAM_FAILURE; > > > > return e.get(); > > > > } > > > > @@ -224,7 +220,7 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs, > > > >range(header_names, num_headers), > >objs(d_header_progs, num_headers)); > > > > - prog.build(devs, opts, headers); > > + prog.compile(devs, opts, headers); > > > > return CL_SUCCESS; > > > > } catch (error &e) { > > > > diff --git a/src/gallium/state_trackers/clover/core/compiler.hpp > > b/src/gallium/state_trackers/clover/core/compiler.hpp index > > c68aa39..31fb6ee 100644 > > --- a/src/gallium/state_trackers/clover/core/compiler.hpp > > +++ b/src/gallium/state_trackers/clover/core/compiler.hpp > > @@ -32,11 +32,16 @@ namespace clover { > > > > module compile_program_llvm(const std::string &source, > > > > const header_map &headers, > > > > - pipe_shader_ir ir, > > > > const std::string &target, > > const std::string &opts, > > std::string &r_log); > > > > + module link_program_llvm(const std::vector &modules, > > +enum pipe_shader_ir ir, > > +const std::string &target, > > +const std::string &opts, > > +std::string &r_log); > > + > > > > module compile_program_tgsi(const std::string &source); > > > > } > > > > diff --git a/src/gallium/state_trackers/clover/core/error.hpp > > b/src/gallium/state_trackers/clover/core/error.hpp index 780b973..3c1bf90 > > 100644 > > --- a/src/gallium/state_trackers/clover/core/error.hpp > > +++ b/src/gallium/state_trackers/clover/core/error.hpp > > @@ -68,10 +68,31 @@ namespace clover { > > > > class build_error : public error { > > > > public: > >build_error(const std::string &what = "") : > > + error(CL_BUILD_PROGRAM_FAILURE, what) { > > + } > > + }; > > + > > This exception class now seems redundant -- With program::build() gone > build is no longer a thing. It's still needed by tgsi. I plan to rework this part later to make it consistent with the way it's handle in llvm/invocation but first off I wanted to be done with clLink :/ > > > + class compile_error : public error { > > + public: > > > > + compile_error(const std::string &what = "") : > > error(CL_COMPILE_PROGRAM_FAILURE, what) { > > > >} > > > > }; > > > > + class link_error : public error { > > + public: > > + link_error(const std::string &what = "") : > > + error(CL_LINK_PROGRAM_FAILURE, what) { > > + } > > + }; > > + > > + class link_option_error : public error { > > + public: > > + link_option_error(const std::string &what = "") : > > + error(CL_INVALID_LINKER_OPTIONS , what) { > > + } > > + }; > > + > > I don't think you really need to special-case link_option_error against > the less specific clover::error class? clLinkProgram should not create a program if it failed to parse the given options, I will use this class to handle this case. Other case should create the said program. That said, it could also have been created in later patch. > > > template > > class invalid_object_error; > > > > diff --git a/src/gallium/state_trackers/clover/core/program.cpp > > b/src/gallium/state_trackers/clover/core/program.cpp index > > 0d6cc4
Re: [Mesa-dev] [PATCH 1/3] clover: separate compile and link stages
EdB writes: > On Sunday 05 July 2015 18:15:33 Francisco Jerez wrote: >>[...] >> > --- a/src/gallium/state_trackers/clover/core/error.hpp >> > +++ b/src/gallium/state_trackers/clover/core/error.hpp >> > @@ -68,10 +68,31 @@ namespace clover { >> > >> > class build_error : public error { >> > >> > public: >> >build_error(const std::string &what = "") : >> > + error(CL_BUILD_PROGRAM_FAILURE, what) { >> > + } >> > + }; >> > + >> >> This exception class now seems redundant -- With program::build() gone >> build is no longer a thing. > > It's still needed by tgsi. > I plan to rework this part later to make it consistent with the way it's > handle in llvm/invocation but first off I wanted to be done with clLink :/ > The tgsi path could also throw compile_error AFAICT? >> >> > + class compile_error : public error { >> > + public: >> > >> > + compile_error(const std::string &what = "") : >> > error(CL_COMPILE_PROGRAM_FAILURE, what) { >> > >> >} >> > >> > }; >> > >> > + class link_error : public error { >> > + public: >> > + link_error(const std::string &what = "") : >> > + error(CL_LINK_PROGRAM_FAILURE, what) { >> > + } >> > + }; >> > + >> > + class link_option_error : public error { >> > + public: >> > + link_option_error(const std::string &what = "") : >> > + error(CL_INVALID_LINKER_OPTIONS , what) { >> > + } >> > + }; >> > + >> >> I don't think you really need to special-case link_option_error against >> the less specific clover::error class? > > clLinkProgram should not create a program if it failed to parse the given > options, I will use this class to handle this case. Other case should create > the said program. > That said, it could also have been created in later patch. > Ah, fair enough. >> >> > template >> > class invalid_object_error; >> > >> > diff --git a/src/gallium/state_trackers/clover/core/program.cpp >> > b/src/gallium/state_trackers/clover/core/program.cpp index >> > 0d6cc40..21faf4e 100644 >> > --- a/src/gallium/state_trackers/clover/core/program.cpp >> > +++ b/src/gallium/state_trackers/clover/core/program.cpp >>[...] >> >> > + std::string log; >> > + >> > + try { >> > + auto module = link_program_llvm(mods, >> > + dev.ir_format(), >> > dev.ir_target(), >> > + opts, log); >> > + _binaries.insert({ &dev, module }); >> > + append_to_log(&dev, log); >> > + } catch (const link_option_error &) { >> > + append_to_log(&dev, log); >> > + throw; >> > + } catch (const error &) { >> > + append_to_log(&dev, log); >> > + r = false; >> >> I suggest you just catch "const error &", update the error log and >> rethrow here, so you save a catch block and an error subclass. > > As explain clLinkProgram doesn't behave the same way regarding error during > option parsing and after. > Still I doubt that you need to handle them separately here, just update the log and rethrow whatever you got, clLinkProgram can still give link_error special treatment and return the created program despite the failure for the application to be able to read back the error log. >[...] signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] i965/vec4: Plumb log_data through so the backend_shader field gets set.
On Fri, Jul 03, 2015 at 09:29:16AM -0700, Kenneth Graunke wrote: > On Friday, July 03, 2015 10:50:52 AM Pohjolainen, Topi wrote: > > On Wed, Jul 01, 2015 at 03:03:31PM -0700, Kenneth Graunke wrote: > > > Jason plumbed this through a while back in the FS backend, but > > > apparently we were just passing NULL in the vec4 backend. > > > > > > This patch passes brw in as intended. > > > > > > Signed-off-by: Kenneth Graunke > > > --- > > > src/mesa/drivers/dri/i965/brw_vec4.cpp| 2 +- > > > src/mesa/drivers/dri/i965/brw_vec4.h | 1 + > > > src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 10 ++ > > > src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h | 1 + > > > src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 3 ++- > > > src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 4 +++- > > > src/mesa/drivers/dri/i965/brw_vs.h| 1 + > > > src/mesa/drivers/dri/i965/gen6_gs_visitor.h | 4 +++- > > > 8 files changed, 18 insertions(+), 8 deletions(-) > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > > index a5c686c..2a56564 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > > @@ -1940,7 +1940,7 @@ brw_vs_emit(struct brw_context *brw, > > > if (!assembly) { > > >prog_data->base.dispatch_mode = DISPATCH_MODE_4X2_DUAL_OBJECT; > > > > > > - vec4_vs_visitor v(brw->intelScreen->compiler, > > > + vec4_vs_visitor v(brw->intelScreen->compiler, brw, > > > c, prog_data, prog, mem_ctx, st_index, > > > !_mesa_is_gles3(&brw->ctx)); > > >if (!v.run(brw_select_clip_planes(&brw->ctx))) { > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h > > > b/src/mesa/drivers/dri/i965/brw_vec4.h > > > index 2ac1693..043557b 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_vec4.h > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4.h > > > @@ -77,6 +77,7 @@ class vec4_visitor : public backend_shader, public > > > ir_visitor > > > { > > > public: > > > vec4_visitor(const struct brw_compiler *compiler, > > > +void *log_data, > > > > As far as I can see, all the constructors addressed in this patch are > > "struct brw_context" aware. Could we use the type "struct brw_context *" > > instead of "void *"? The pointer is in the end given to > > shader_perf_log_mesa() > > which in turn unconditionally casts is to "struct brw_context *". > > Jason is trying to separate the compiler backend from the OpenGL driver, > so we can more easily reuse it...elsewhere :) "elsewhere" does not have > a brw_context, but will have some other structure. So instead he made the > logging functions pass a void * closure. > > I'm also concerned that if we pass in brw_context that people will start > using it everywhere. Admittedly, having to type log_data-> might deter > them, though... Okay, there is a specific reason for it, and in that case I'm fine with the interface as is. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs: Don't disable SIMD16 when using the pixel interpolator
On Fri, Jul 3, 2015 at 3:46 AM, Francisco Jerez wrote: > Heh, I happened to come across this comment yesterday while looking for > the remaining no16 calls and wondered why on earth it couldn't do the > same that the normal interpolation code does. After this patch and a > series coming up that will remove all SIMD8 fallbacks from the texturing > code, the only case left still applicable to Gen7 hardware and later > will be "SIMD16 explicit accumulator operands unsupported". Anyone? I can explain the problem: Prior to Gen7, the were were two accumulator registers usable for most datatypes (acc0, acc1). On Gen7, they removed integer-support from acc1, which was necessary to implement SIMD16 integer multiplication using the normal MUL/MACH sequence. I implemented 32-bit integer multiplication without using the accumulator in: commit f7df169ba13d22338e9276839a7e9629ca0a6b4f Author: Matt Turner Date: Wed May 13 18:34:03 2015 -0700 i965/fs: Implement integer multiply without mul/mach. The remaining cases of "SIMD16 explicit accumulator operands unsupported" are ADDC, SUBB, and 32x32 -> high 32-bit multiplication. The remaining multiplication case can probably be reimplemented without the accumulator, like I did for the low 32-bit result. The ADDC and SUBB instructions implicitly write a bit to the accumulator if their operations overflowed. The 1Q/2Q quarter control is supposed to select which register is implicitly written -- except that there is no acc1 for integer types. Haswell and newer ignore the quarter control and always write acc0, but IVB (and presumably BYT) attempt to write to the nonexistent acc1. You could split the the SIMD16 operations into 2x SIMD8s and set force_writemask_all on the second, followed by a 2Q MOV from the accumulator. Maybe we'd rather use the .o (overflow) conditional mod on a result ADD to implement this. Ideally, we'd recognize merge the addition and carry operations into a single ADDC instruction, but it's pretty unimportant. It's all pretty academic -- I've never seen an application use either operation (or [iu]mulExtended either). ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 5/6] i965: Upload binding tables in hw-generated binding table format.
On Fri, Jul 3, 2015 at 12:00 AM, Abdiel Janulgue wrote: > diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c > b/src/mesa/drivers/dri/i965/brw_binding_tables.c > index 7396c94..34a88b9 100644 > --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c > +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c > @@ -70,30 +90,51 @@ brw_upload_binding_table(struct brw_context *brw, > >stage_state->bind_bo_offset = 0; > } else { > - /* Upload a new binding table. */ > - if (INTEL_DEBUG & DEBUG_SHADER_TIME) { > - brw->vtbl.emit_buffer_surface_state( > -brw, &stage_state->surf_offset[ > -prog_data->binding_table.shader_time_start], > -brw->shader_time.bo, 0, BRW_SURFACEFORMAT_RAW, > -brw->shader_time.bo->size, 1, true); > + /* When RS is enabled use hw-binding table uploads, otherwise fallback > to > + * software-uploads. > + */ > + if (brw->use_resource_streamer) { > + gen7_update_binding_table_from_array(brw, stage_state->stage, > + stage_state->surf_offset, > + prog_data->binding_table > + .size_bytes / 4); > + } else { > + /* Upload a new binding table. */ > + if (INTEL_DEBUG & DEBUG_SHADER_TIME) { > +brw->vtbl.emit_buffer_surface_state( > + brw, &stage_state->surf_offset[ > + prog_data->binding_table.shader_time_start], > + brw->shader_time.bo, 0, BRW_SURFACEFORMAT_RAW, > + brw->shader_time.bo->size, 1, true); > + } > + > + uint32_t *bind = brw_state_batch(brw, AUB_TRACE_BINDING_TABLE, > + > prog_data->binding_table.size_bytes, > + 32, > + &stage_state->bind_bo_offset); > + > + /* BRW_NEW_SURFACES and BRW_NEW_*_CONSTBUF */ > + memcpy(bind, stage_state->surf_offset, > +prog_data->binding_table.size_bytes); >} > - > - uint32_t *bind = brw_state_batch(brw, AUB_TRACE_BINDING_TABLE, > - prog_data->binding_table.size_bytes, > 32, > - &stage_state->bind_bo_offset); > - > - /* BRW_NEW_SURFACES and BRW_NEW_*_CONSTBUF */ > - memcpy(bind, stage_state->surf_offset, > - prog_data->binding_table.size_bytes); > } > > brw->ctx.NewDriverState |= brw_new_binding_table; > > if (brw->gen >= 7) { > + Extra newline. > + if (brw->use_resource_streamer) > + stage_state->bind_bo_offset = > +reserve_hw_bt_space(brw, prog_data->binding_table.size_bytes); We usually put braces around nested if statements, but *always* put braces around if statements if they stretch across multiple lines (see commit 22af95af). > + >BEGIN_BATCH(2); >OUT_BATCH(packet_name << 16 | (2 - 2)); > - OUT_BATCH(stage_state->bind_bo_offset); > + /* Align SurfaceStateOffset[16:6] format to [15:5] PS Binding Table > field > + * when hw-generated binding table is enabled. > + */ > + OUT_BATCH(brw->use_resource_streamer ? > +(stage_state->bind_bo_offset >> 1) : > +stage_state->bind_bo_offset); >ADVANCE_BATCH(); > } > } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs: Don't disable SIMD16 when using the pixel interpolator
Hi Matt, Matt Turner writes: > On Fri, Jul 3, 2015 at 3:46 AM, Francisco Jerez wrote: >> Heh, I happened to come across this comment yesterday while looking for >> the remaining no16 calls and wondered why on earth it couldn't do the >> same that the normal interpolation code does. After this patch and a >> series coming up that will remove all SIMD8 fallbacks from the texturing >> code, the only case left still applicable to Gen7 hardware and later >> will be "SIMD16 explicit accumulator operands unsupported". Anyone? > > I can explain the problem: > > Prior to Gen7, the were were two accumulator registers usable for most > datatypes (acc0, acc1). On Gen7, they removed integer-support from > acc1, which was necessary to implement SIMD16 integer multiplication > using the normal MUL/MACH sequence. IIRC they got rid of the acc1 register on IVB altogether, but managed to emulate it for floating point types by taking advantage of the extra precision not normally used for floating point arithmetic (the fake acc1 basically uses the same storage in the EU that holds the 32 MSBs of each component of acc0), what explains the apparent asymmetry between integer and floating point data types. > I implemented 32-bit integer multiplication without using the > accumulator in: > > commit f7df169ba13d22338e9276839a7e9629ca0a6b4f > Author: Matt Turner > Date: Wed May 13 18:34:03 2015 -0700 > > i965/fs: Implement integer multiply without mul/mach. > > The remaining cases of "SIMD16 explicit accumulator operands > unsupported" are ADDC, SUBB, and 32x32 -> high 32-bit multiplication. > The remaining multiplication case can probably be reimplemented > without the accumulator, like I did for the low 32-bit result. > Hmm, I have the suspicion that high 32-bit multiplication is the one legit use-case of the accumulator we have left, any algorithm breaking it up into individual 32/16-bit MULs would end up doing more multiplications than the two MUL/MACH instructions we do now, because we wouldn't be able to take advantage of the full precision implemented in the hardware if we truncate the 48-bit intermediate results to fit in a 32-bit register. How about we use the SIMD width lowering pass to split the computation in half? It should be quite straightforward but will probably require adding a new virtual opcode so that the SIMD width lowering pass doesn't have to deal with (seriously fucked-up) accumulators directly. > The ADDC and SUBB instructions implicitly write a bit to the > accumulator if their operations overflowed. The 1Q/2Q quarter control > is supposed to select which register is implicitly written -- except > that there is no acc1 for integer types. Haswell and newer ignore the > quarter control and always write acc0, but IVB (and presumably BYT) > attempt to write to the nonexistent acc1. > > You could split the the SIMD16 operations into 2x SIMD8s and set > force_writemask_all on the second, followed by a 2Q MOV from the > accumulator. Maybe we'd rather use the .o (overflow) conditional mod > on a result ADD to implement this. > Yeah. I did in fact try to implement uaddCarry last Friday without using the accumulator by doing something like: | CMP.o tmp, src0, -src1 | MOV dst, -tmp ...what of course didn't work because of the extra argument precision post-source modifiers and also because the .o condmod doesn't work at all on CMP, but... > Ideally, we'd recognize merge the addition and carry operations into a > single ADDC instruction, but it's pretty unimportant. It's all pretty > academic -- I've never seen an application use either operation (or > [iu]mulExtended either). ...if we did the following instead: | ADD tmp, src0, src1 | CMP.l tmp, tmp, src0 | MOV dst, -tmp the ADD could be easily CSE'ed with the original ADD instruction (and the source modifier of the last MOV can also be easily propagated into some other instruction), so even though it seems like one instruction more than what we emit now it might be a net win (aside from it working on SIMD16). usubBorrow is even easier: | CMP.l tmp, src0, src1 | MOV dst, -tmp I was planning to run it through shader-db tomorrow but if you say you've never seen them used I guess I shouldn't get my hopes too high? :P signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91149] make check optimization-test regression
https://bugs.freedesktop.org/show_bug.cgi?id=91149 Matt Turner changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #5 from Matt Turner --- Fixed by commit 83984f134b4a1e2829cb238c404bc82c98be6082 Author: Erik Faye-Lund Date: Fri Jul 3 09:46:01 2015 +0200 glsl: add a missing call to _mesa_locale_init -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] glsl: move max_index calc to assign_attribute_or_color_locations
On Fri, Jul 3, 2015 at 4:54 AM, Tapani Pälli wrote: > Change function to get all gl_constants for inspection, this is used > by follow-up patch. > > Signed-off-by: Tapani Pälli > --- > src/glsl/linker.cpp | 16 > 1 file changed, 12 insertions(+), 4 deletions(-) > > diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp > index 71a45e8..aae0c0d 100644 > --- a/src/glsl/linker.cpp > +++ b/src/glsl/linker.cpp > @@ -1953,9 +1953,17 @@ find_available_slots(unsigned used_mask, unsigned > needed_count) > */ > bool > assign_attribute_or_color_locations(gl_shader_program *prog, > - unsigned target_index, > - unsigned max_index) > +struct gl_constants *constants, > +unsigned target_index) > { > + /* FINISHME: The value of the max_attribute_index parameter is > +* FINISHME: implementation dependent based on the value of > +* FINISHME: GL_MAX_VERTEX_ATTRIBS. GL_MAX_VERTEX_ATTRIBS must be > +* FINISHME: at least 16, so hardcode 16 for now. > +*/ > + unsigned max_index = (target_index == MESA_SHADER_VERTEX) ? 16 : > + MAX2(constants->MaxDrawBuffers, constants->MaxDualSourceDrawBuffers); Shouldn't that just be constants->Program[MESA_SHADER_VERTEX].MaxAttribs ? > + > /* Mark invalid locations as being used. > */ > unsigned used_locations = (max_index >= 32) > @@ -3061,11 +3069,11 @@ link_shaders(struct gl_context *ctx, struct > gl_shader_program *prog) > * FINISHME: GL_MAX_VERTEX_ATTRIBS. GL_MAX_VERTEX_ATTRIBS must be > * FINISHME: at least 16, so hardcode 16 for now. > */ > - if (!assign_attribute_or_color_locations(prog, MESA_SHADER_VERTEX, 16)) { > + if (!assign_attribute_or_color_locations(prog, &ctx->Const, > MESA_SHADER_VERTEX)) { >goto done; > } > > - if (!assign_attribute_or_color_locations(prog, MESA_SHADER_FRAGMENT, > MAX2(ctx->Const.MaxDrawBuffers, ctx->Const.MaxDualSourceDrawBuffers))) { > + if (!assign_attribute_or_color_locations(prog, &ctx->Const, > MESA_SHADER_FRAGMENT)) { >goto done; > } > > -- > 2.1.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] glsl: move max_index calc to assign_attribute_or_color_locations
On Sun, Jul 5, 2015 at 8:17 PM, Ilia Mirkin wrote: > On Fri, Jul 3, 2015 at 4:54 AM, Tapani Pälli wrote: >> Change function to get all gl_constants for inspection, this is used >> by follow-up patch. >> >> Signed-off-by: Tapani Pälli >> --- >> src/glsl/linker.cpp | 16 >> 1 file changed, 12 insertions(+), 4 deletions(-) >> >> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp >> index 71a45e8..aae0c0d 100644 >> --- a/src/glsl/linker.cpp >> +++ b/src/glsl/linker.cpp >> @@ -1953,9 +1953,17 @@ find_available_slots(unsigned used_mask, unsigned >> needed_count) >> */ >> bool >> assign_attribute_or_color_locations(gl_shader_program *prog, >> - unsigned target_index, >> - unsigned max_index) >> +struct gl_constants *constants, >> +unsigned target_index) >> { >> + /* FINISHME: The value of the max_attribute_index parameter is >> +* FINISHME: implementation dependent based on the value of >> +* FINISHME: GL_MAX_VERTEX_ATTRIBS. GL_MAX_VERTEX_ATTRIBS must be >> +* FINISHME: at least 16, so hardcode 16 for now. >> +*/ >> + unsigned max_index = (target_index == MESA_SHADER_VERTEX) ? 16 : >> + MAX2(constants->MaxDrawBuffers, constants->MaxDualSourceDrawBuffers); > > Shouldn't that just be constants->Program[MESA_SHADER_VERTEX].MaxAttribs ? > >> + >> /* Mark invalid locations as being used. >> */ >> unsigned used_locations = (max_index >= 32) >> @@ -3061,11 +3069,11 @@ link_shaders(struct gl_context *ctx, struct >> gl_shader_program *prog) >> * FINISHME: GL_MAX_VERTEX_ATTRIBS. GL_MAX_VERTEX_ATTRIBS must be >> * FINISHME: at least 16, so hardcode 16 for now. And this copy of the comment should also go away >> */ >> - if (!assign_attribute_or_color_locations(prog, MESA_SHADER_VERTEX, 16)) { >> + if (!assign_attribute_or_color_locations(prog, &ctx->Const, >> MESA_SHADER_VERTEX)) { >>goto done; >> } >> >> - if (!assign_attribute_or_color_locations(prog, MESA_SHADER_FRAGMENT, >> MAX2(ctx->Const.MaxDrawBuffers, ctx->Const.MaxDualSourceDrawBuffers))) { >> + if (!assign_attribute_or_color_locations(prog, &ctx->Const, >> MESA_SHADER_FRAGMENT)) { >>goto done; >> } >> >> -- >> 2.1.0 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs: Don't disable SIMD16 when using the pixel interpolator
On Monday, July 06, 2015 02:45:59 AM Francisco Jerez wrote: > Matt Turner writes: > > On Fri, Jul 3, 2015 at 3:46 AM, Francisco Jerez > > wrote: [snip] > Yeah. I did in fact try to implement uaddCarry last Friday without > using the accumulator by doing something like: > > | CMP.o tmp, src0, -src1 > | MOV dst, -tmp > > ...what of course didn't work because of the extra argument precision > post-source modifiers and also because the .o condmod doesn't work at > all on CMP, but... > > > Ideally, we'd recognize merge the addition and carry operations into a > > single ADDC instruction, but it's pretty unimportant. It's all pretty > > academic -- I've never seen an application use either operation (or > > [iu]mulExtended either). > > ...if we did the following instead: > > | ADD tmp, src0, src1 > | CMP.l tmp, tmp, src0 > | MOV dst, -tmp > > the ADD could be easily CSE'ed with the original ADD instruction (and > the source modifier of the last MOV can also be easily propagated into > some other instruction), so even though it seems like one instruction > more than what we emit now it might be a net win (aside from it working > on SIMD16). usubBorrow is even easier: > > | CMP.l tmp, src0, src1 > | MOV dst, -tmp > > I was planning to run it through shader-db tomorrow but if you say > you've never seen them used I guess I shouldn't get my hopes too high? :P Yeah, there's nothing in shader-db that uses them, so I wouldn't bother. I'm definitely a huge fan of avoiding the accumulator - it's always been a total pain to deal with. If you guys come up with a solution that uses normal registers and avoids MACH, that sounds great to me. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 39/78] i965/nir/vec4: Add swizzle utility method for vector ops
On Fri, Jul 3, 2015 at 10:23 AM, Antía Puentes wrote: > Hi Jason, > > On mar, 2015-06-30 at 14:18 -0700, Jason Ekstrand wrote: >> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev wrote: >> > From: Antia Puentes >> > >> > For operations that have a predefined operand size > 0, defined in >> > glsl/nir/nir_opcodes.c, NIR returns a swizzle containing zeros in the >> > components from outside the source vector. However, the driver >> > expects those components to have a swizzle value equal to the swizzle >> > of the component in (number of vector elements - 1). For example, >> > >> >for a vec2 operation with an identity swizzle (.xy): >> > - (NIR -> XYXX, driver ->X) >> > >> >for a vec3 operation with swizzle (.zxy) >> > - (NIR-> ZXYX, driver -> ZXYY) >> >> Why is this needed. Was there some bug you ran into regarding this or >> are you just trying to make the generated code match? If the >> component isn't used, then it seems to me like the swizzle shouldn't >> matter. NIR may give you bogus swizzles outside of the enabled >> channels but it may do that regardless. I'm also not seeing how >> composing swizzles fixes anything. > > There were several bugs in our NIR->vec4 code path related to NIR > returning bogus swizzles outside the components of a vector. When doing > an "any", "equal" or "notEqual" operation on vectors, the result of the > instruction "cmp" is predicated using "all4h" or "any4h" depending on > the operation. The results were incorrect for vectors with less than 4 > elements because: (1) I was not setting the writemask of the "cmp" > operation according to the vector size and (2) NIR returns bogus > swizzles for components outside the vector. As a consequence, we had > spurious component-wise comparisons that affected the result of the > predication. > > As composing the operand swizzle with the identity swizzle-per-size > means to repeat the value of the swizzle of the last vector-element on > the components outside the vector, the error mentioned above does not > happen because to repeat the last (inside-boundaries) component-wise > comparison for the components outside the vector does not affect the > result of the predication. > > Other way to fix the bugs above is to set the writemask of the "cmp" > operation to the appropriate mask according to the vector size, thus > forcing the "cmp" instruction to compare only the components inside the > vector boundaries. As the vec4_visitor does not set the writemask of the > "cmp" instruction because it relies on the operand having the correct > swizzle outside the vector boundaries, for consistency with the existing > behaviour, I decided to do the same and compose the swizzle (notice that > the composing is done in the vec4_visitor when visiting an ir_swizzle, > while in NIR we only set the swizzle for the elements inside the > vector). I can remove this function if you prefer setting the writemask > over composing the swizzles. I could go either way. I kind of like the idea of setting the writemask rather than the swizzle, but I'm fine with either, now that you've explained it. If we go with fixing the swizzle, I think I'd like to see a few changes. First, I think I'd rather see this as part of the loop at the top of nir_emit_alu that sets up the sources. Second, your implementation always uses src[0] as representative of all fixed-sized sources; this works now but is not guaranteed. Third, I think it needs a better comment. The current comment shows differences between what NIR gives you in some cases and what the old visitor gives but, as far as I can tell, doesn't actually demonstrate something that would be a bug without it. The explanation you gave above together with a better example would probably be sufficient. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 19/78] nir/nir_lower_io: Add vec4 support
On Fri, Jul 3, 2015 at 12:58 AM, Iago Toral wrote: > On Thu, 2015-07-02 at 09:31 +0200, Iago Toral wrote: >> On Tue, 2015-06-30 at 11:32 -0700, Jason Ekstrand wrote: >> > I'm not sure what I think about adding an is_scalar flag vs. having >> > _scalar and _vec4 versions of each function. My feeling is that once >> > we tweak assign_var_locations as I mentioned for vec4 outputs, we'll >> > find that we want to have them separate. The big thing here is that >> > I'd rather have _vec4 and _scalar versions internally call the same >> > function than have a general-looking function that switches inside on >> > is_scalar and does two completely different things. So if we can >> > share code like this for all of them, is_scalar is totally fine. If >> > we start having divergence, different scalar/vec4 functions is >> > probably better. >> > --Jason >> >> Ok, let's see how the changes to vec4 outputs affects this. If we end up >> needing more changes than just the kind of type_size() function we need >> to call then I'll look into having separate paths for scalar and vec4. > > Notice that this patch already changes how assign_var_locations works > (that calls type_size which is the one thing affected by the is_scalar > flag). The mappings you saw in the implementations of the input and > output intrinsics stopped being necessary with this patch and we will > just remove them, it just happened that I fixed it for uniforms and did > not realize that the same situation affected inputs and outputs, but we > just confirmed that with this patch the indirections are not needed > there either. > > So, as things stand now, the changes in this patch are the only changes > we need to nir_lower_io for vec4 shaders, at least for now. Knowing > that, are you okay with this implementation or would you rather change > it to have separate functions for vec4 and scalar? That's fine. --Jason > Iago > >> > On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev >> > wrote: >> > > From: Iago Toral Quiroga >> > > >> > > The current implementation operates in scalar mode only, so add a vec4 >> > > mode where types are padded to vec4 sizes. >> > > >> > > This will be useful in the i965 driver for its vec4 nir backend >> > > (and possbly other drivers that have vec4-based shaders). >> > > >> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580 >> > > --- >> > > src/glsl/nir/nir.h | 18 >> > > src/glsl/nir/nir_lower_io.c | 87 >> > > + >> > > src/mesa/drivers/dri/i965/brw_nir.c | 18 +--- >> > > 3 files changed, 90 insertions(+), 33 deletions(-) >> > > >> > > diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h >> > > index 697d37e..334abbc 100644 >> > > --- a/src/glsl/nir/nir.h >> > > +++ b/src/glsl/nir/nir.h >> > > @@ -1640,14 +1640,16 @@ void nir_lower_global_vars_to_local(nir_shader >> > > *shader); >> > > >> > > void nir_lower_locals_to_regs(nir_shader *shader); >> > > >> > > -void nir_assign_var_locations_scalar(struct exec_list *var_list, >> > > - unsigned *size); >> > > -void nir_assign_var_locations_scalar_direct_first(nir_shader *shader, >> > > - struct exec_list >> > > *var_list, >> > > - unsigned *direct_size, >> > > - unsigned *size); >> > > - >> > > -void nir_lower_io(nir_shader *shader); >> > > +void nir_assign_var_locations(struct exec_list *var_list, >> > > + unsigned *size, >> > > + bool is_scalar); >> > > +void nir_assign_var_locations_direct_first(nir_shader *shader, >> > > + struct exec_list *var_list, >> > > + unsigned *direct_size, >> > > + unsigned *size, >> > > + bool is_scalar); >> > > + >> > > +void nir_lower_io(nir_shader *shader, bool is_scalar); >> > > >> > > void nir_lower_vars_to_ssa(nir_shader *shader); >> > > >> > > diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c >> > > index 6761d5b..3ecd386 100644 >> > > --- a/src/glsl/nir/nir_lower_io.c >> > > +++ b/src/glsl/nir/nir_lower_io.c >> > > @@ -29,19 +29,56 @@ >> > > /* >> > > * This lowering pass converts references to input/output variables with >> > > * loads/stores to actual input/output intrinsics. >> > > - * >> > > - * NOTE: This pass really only works for scalar backends at the moment >> > > due >> > > - * to the way it packes the input/output data. >> > > */ >> > > >> > > #include "nir.h" >> > > >> > > struct lower_io_state { >> > > void *mem_ctx; >> > > + bool is_scalar; >> > > }; >> > > >> > > +static int >> > > +type_size_vec4(const struct glsl_type *type) >> > > +{ >> > > + unsigned int i; >> > > + int size;
Re: [Mesa-dev] [PATCH 13/78] i965/nir/vec4: Implement conditional statements (nir_cf_node_if)
On Fri, Jul 3, 2015 at 12:32 AM, Iago Toral wrote: > On Thu, 2015-07-02 at 10:11 -0700, Jason Ekstrand wrote: >> On Wed, Jul 1, 2015 at 11:44 PM, Iago Toral wrote: >> > On Tue, 2015-06-30 at 09:30 -0700, Jason Ekstrand wrote: >> >> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev >> >> wrote: >> >> > From: Iago Toral Quiroga >> >> > >> >> > The same we do in the FS NIR backend, only that here we need to consider >> >> > the number of components in the condition and adjust the swizzle >> >> > accordingly. >> >> > >> >> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580 >> >> > --- >> >> > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 23 ++- >> >> > 1 file changed, 22 insertions(+), 1 deletion(-) >> >> > >> >> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> >> > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> >> > index 1ec75ee..d81b6a7 100644 >> >> > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> >> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> >> > @@ -314,7 +314,28 @@ vec4_visitor::nir_emit_cf_list(exec_list *list) >> >> > void >> >> > vec4_visitor::nir_emit_if(nir_if *if_stmt) >> >> > { >> >> > - /* @TODO: Not yet implemented */ >> >> > + /* First, put the condition in f0 */ >> >> > + src_reg condition = get_nir_src(if_stmt->condition, >> >> > BRW_REGISTER_TYPE_D); >> >> > + >> >> > + int num_components = if_stmt->condition.is_ssa ? >> >> > + if_stmt->condition.ssa->num_components : >> >> > + if_stmt->condition.reg.reg->num_components; >> >> > + >> >> > + condition.swizzle = brw_swizzle_for_size(num_components); >> >> > + >> >> > + vec4_instruction *inst = emit(MOV(dst_null_d(), condition)); >> >> > + inst->conditional_mod = BRW_CONDITIONAL_NZ; >> >> >> >> NIR if statements read only one component by definition. There's no >> >> need to do this. >> > >> > I see, we still need to do this explicitly though: >> > >> > condition.swizzle = brw_swizzle_for_size(1); >> > >> > Maybe we should just make get_nir_src() set the swizzle based on the >> > number of components instead so we don't have to do this kind of things >> > after calling that, does that sound better? >> >> Just pass the number of components into get_nir_src()? That sounds fine to >> me. >> --Jason > > Is that better? In this case it is sufficient because it is always 1, > but in other cases where we call get_nir_src I imagine we would also > want to get a register with the swizzle preset based on the number of > components the nir src, right? In that case, the number of components we > would need to pass would come from doing something like: > > int num_components = src.is_ssa ? > src.ssa->num_components : > src.reg.reg->num_components; That's not quite what you want either. NIR doesn't guarantee that the number of components used equals the number of components in the register or ssa value. (We probably want to change that eventually). However, you should always know how many components you have. For ALU operations, it's either 4 (if it has a write-mask) or it's in the op infos; for intrinsics it's either in the intrinsic_infos or in intrin->num_components; for if's it's, always 1, etc. Does that make more sense? Also, this should affect the e-mail that I just sent to Antia. --Jason > and doing this every time we call get_nir_src does not look very nice to > me. Since we are already passing the nir src to get_nir_src, wouldn't it > be better if we had that function do this for us? > > Iago > >> >> >> >> > + emit(IF(BRW_PREDICATE_NORMAL)); >> >> > + >> >> > + nir_emit_cf_list(&if_stmt->then_list); >> >> > + >> >> > + /* note: if the else is empty, dead CF elimination will remove it */ >> >> > + emit(BRW_OPCODE_ELSE); >> >> > + >> >> > + nir_emit_cf_list(&if_stmt->else_list); >> >> > + >> >> > + emit(BRW_OPCODE_ENDIF); >> >> > } >> >> > >> >> > void >> >> > -- >> >> > 2.1.4 >> >> > >> >> > ___ >> >> > mesa-dev mailing list >> >> > mesa-dev@lists.freedesktop.org >> >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev >> >> ___ >> >> mesa-dev mailing list >> >> mesa-dev@lists.freedesktop.org >> >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev >> > >> > >> > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 13/78] i965/nir/vec4: Implement conditional statements (nir_cf_node_if)
On Sun, 2015-07-05 at 19:14 -0700, Jason Ekstrand wrote: > On Fri, Jul 3, 2015 at 12:32 AM, Iago Toral wrote: > > On Thu, 2015-07-02 at 10:11 -0700, Jason Ekstrand wrote: > >> On Wed, Jul 1, 2015 at 11:44 PM, Iago Toral wrote: > >> > On Tue, 2015-06-30 at 09:30 -0700, Jason Ekstrand wrote: > >> >> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev > >> >> wrote: > >> >> > From: Iago Toral Quiroga > >> >> > > >> >> > The same we do in the FS NIR backend, only that here we need to > >> >> > consider > >> >> > the number of components in the condition and adjust the swizzle > >> >> > accordingly. > >> >> > > >> >> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580 > >> >> > --- > >> >> > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 23 > >> >> > ++- > >> >> > 1 file changed, 22 insertions(+), 1 deletion(-) > >> >> > > >> >> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > >> >> > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > >> >> > index 1ec75ee..d81b6a7 100644 > >> >> > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > >> >> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > >> >> > @@ -314,7 +314,28 @@ vec4_visitor::nir_emit_cf_list(exec_list *list) > >> >> > void > >> >> > vec4_visitor::nir_emit_if(nir_if *if_stmt) > >> >> > { > >> >> > - /* @TODO: Not yet implemented */ > >> >> > + /* First, put the condition in f0 */ > >> >> > + src_reg condition = get_nir_src(if_stmt->condition, > >> >> > BRW_REGISTER_TYPE_D); > >> >> > + > >> >> > + int num_components = if_stmt->condition.is_ssa ? > >> >> > + if_stmt->condition.ssa->num_components : > >> >> > + if_stmt->condition.reg.reg->num_components; > >> >> > + > >> >> > + condition.swizzle = brw_swizzle_for_size(num_components); > >> >> > + > >> >> > + vec4_instruction *inst = emit(MOV(dst_null_d(), condition)); > >> >> > + inst->conditional_mod = BRW_CONDITIONAL_NZ; > >> >> > >> >> NIR if statements read only one component by definition. There's no > >> >> need to do this. > >> > > >> > I see, we still need to do this explicitly though: > >> > > >> > condition.swizzle = brw_swizzle_for_size(1); > >> > > >> > Maybe we should just make get_nir_src() set the swizzle based on the > >> > number of components instead so we don't have to do this kind of things > >> > after calling that, does that sound better? > >> > >> Just pass the number of components into get_nir_src()? That sounds fine > >> to me. > >> --Jason > > > > Is that better? In this case it is sufficient because it is always 1, > > but in other cases where we call get_nir_src I imagine we would also > > want to get a register with the swizzle preset based on the number of > > components the nir src, right? In that case, the number of components we > > would need to pass would come from doing something like: > > > > int num_components = src.is_ssa ? > > src.ssa->num_components : > > src.reg.reg->num_components; > > That's not quite what you want either. NIR doesn't guarantee that the > number of components used equals the number of components in the > register or ssa value. (We probably want to change that eventually). > However, you should always know how many components you have. For ALU > operations, it's either 4 (if it has a write-mask) or it's in the op > infos; for intrinsics it's either in the intrinsic_infos or in > intrin->num_components; for if's it's, always 1, etc. > > Does that make more sense? Yes, thanks for explaining this. I recall now that Connor had mentioned something like this once as well. Iago > Also, this should affect the e-mail that I > just sent to Antia. > > --Jason > > > and doing this every time we call get_nir_src does not look very nice to > > me. Since we are already passing the nir src to get_nir_src, wouldn't it > > be better if we had that function do this for us? > > > > Iago > > > >> >> > >> >> > + emit(IF(BRW_PREDICATE_NORMAL)); > >> >> > + > >> >> > + nir_emit_cf_list(&if_stmt->then_list); > >> >> > + > >> >> > + /* note: if the else is empty, dead CF elimination will remove it > >> >> > */ > >> >> > + emit(BRW_OPCODE_ELSE); > >> >> > + > >> >> > + nir_emit_cf_list(&if_stmt->else_list); > >> >> > + > >> >> > + emit(BRW_OPCODE_ENDIF); > >> >> > } > >> >> > > >> >> > void > >> >> > -- > >> >> > 2.1.4 > >> >> > > >> >> > ___ > >> >> > mesa-dev mailing list > >> >> > mesa-dev@lists.freedesktop.org > >> >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > >> >> ___ > >> >> mesa-dev mailing list > >> >> mesa-dev@lists.freedesktop.org > >> >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > >> > > >> > > >> > > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev