[Mesa-dev] [PATCH v2 5/6] i965: Upload binding tables in hw-generated binding table format.
When hardware-generated binding tables are enabled, use the hw-generated binding table format when uploading binding table state. Normally, the CS will will just consume the binding table pointer commands as pipelined state. When the RS is enabled however, the RS flushes whatever edited surface state entries of our on-chip binding table to the binding table pool before passing the command on to the CS. Note that the the binding table pointer offset is relative to the binding table pool base address when resource streamer instead of the surface state base address. v2: Fix possible buffer overflow when allocating a chunk out of the hw-binding table pool (Ken). Cc: kenn...@whitecape.org Signed-off-by: Abdiel Janulgue --- src/mesa/drivers/dri/i965/brw_binding_tables.c | 73 -- 1 file changed, 57 insertions(+), 16 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c b/src/mesa/drivers/dri/i965/brw_binding_tables.c index 7396c94..34a88b9 100644 --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c @@ -50,6 +50,26 @@ static const GLuint stage_to_bt_edit[MESA_SHADER_FRAGMENT + 1] = { _3DSTATE_BINDING_TABLE_EDIT_PS, }; +static uint32_t +reserve_hw_bt_space(struct brw_context *brw, unsigned bytes) +{ + if (brw->hw_bt_pool.next_offset + bytes >= brw->hw_bt_pool.bo->size - 128) { + gen7_reset_hw_bt_pool_offsets(brw); + } + + uint32_t offset = brw->hw_bt_pool.next_offset; + + /* From the Haswell PRM, Volume 2b: Command Reference: Instructions, +* 3DSTATE_BINDING_TABLE_POINTERS_xS: +* +* "If HW Binding Table is enabled, the offset is relative to the +* Binding Table Pool Base Address and the alignment is 64 bytes." +*/ + brw->hw_bt_pool.next_offset += ALIGN(bytes, 64); + + return offset; +} + /** * Upload a shader stage's binding table as indirect state. * @@ -70,30 +90,51 @@ brw_upload_binding_table(struct brw_context *brw, stage_state->bind_bo_offset = 0; } else { - /* Upload a new binding table. */ - if (INTEL_DEBUG & DEBUG_SHADER_TIME) { - brw->vtbl.emit_buffer_surface_state( -brw, &stage_state->surf_offset[ -prog_data->binding_table.shader_time_start], -brw->shader_time.bo, 0, BRW_SURFACEFORMAT_RAW, -brw->shader_time.bo->size, 1, true); + /* When RS is enabled use hw-binding table uploads, otherwise fallback to + * software-uploads. + */ + if (brw->use_resource_streamer) { + gen7_update_binding_table_from_array(brw, stage_state->stage, + stage_state->surf_offset, + prog_data->binding_table + .size_bytes / 4); + } else { + /* Upload a new binding table. */ + if (INTEL_DEBUG & DEBUG_SHADER_TIME) { +brw->vtbl.emit_buffer_surface_state( + brw, &stage_state->surf_offset[ + prog_data->binding_table.shader_time_start], + brw->shader_time.bo, 0, BRW_SURFACEFORMAT_RAW, + brw->shader_time.bo->size, 1, true); + } + + uint32_t *bind = brw_state_batch(brw, AUB_TRACE_BINDING_TABLE, + prog_data->binding_table.size_bytes, + 32, + &stage_state->bind_bo_offset); + + /* BRW_NEW_SURFACES and BRW_NEW_*_CONSTBUF */ + memcpy(bind, stage_state->surf_offset, +prog_data->binding_table.size_bytes); } - - uint32_t *bind = brw_state_batch(brw, AUB_TRACE_BINDING_TABLE, - prog_data->binding_table.size_bytes, 32, - &stage_state->bind_bo_offset); - - /* BRW_NEW_SURFACES and BRW_NEW_*_CONSTBUF */ - memcpy(bind, stage_state->surf_offset, - prog_data->binding_table.size_bytes); } brw->ctx.NewDriverState |= brw_new_binding_table; if (brw->gen >= 7) { + + if (brw->use_resource_streamer) + stage_state->bind_bo_offset = +reserve_hw_bt_space(brw, prog_data->binding_table.size_bytes); + BEGIN_BATCH(2); OUT_BATCH(packet_name << 16 | (2 - 2)); - OUT_BATCH(stage_state->bind_bo_offset); + /* Align SurfaceStateOffset[16:6] format to [15:5] PS Binding Table field + * when hw-generated binding table is enabled. + */ + OUT_BATCH(brw->use_resource_streamer ? +(stage_state->bind_bo_offset >> 1) : +stage_state->bind_bo_offset); ADVANCE_BATCH(); } } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v5 2/6] i965: Enable resource streamer for the batchbuffer
Check first if the hardware and kernel supports resource streamer. If this is allowed, tell the kernel to enable the resource streamer enable bit on MI_BATCHBUFFER_START by specifying I915_EXEC_RESOURCE_STREAMER execbuffer flags. v2: - Use new I915_PARAM_HAS_RESOURCE_STREAMER ioctl to check if kernel supports RS (Ken). - Add brw_device_info::has_resource_streamer and toggle it for Haswell, Broadwell, Cherryview, Skylake, and Broxton (Ken). v3: - Update I915_PARAM_HAS_RESOURCE_STREAMER to match updated kernel. v4: - Always inspect the getparam.value (Chris Wilson). v5: - Fold redundant devinfo->has_resource_streamer check in context create into init screen. Cc: kenn...@whitecape.org Cc: ch...@chris-wilson.co.uk Signed-off-by: Abdiel Janulgue --- src/mesa/drivers/dri/i965/brw_context.c | 4 src/mesa/drivers/dri/i965/brw_context.h | 1 + src/mesa/drivers/dri/i965/brw_device_info.c | 5 - src/mesa/drivers/dri/i965/brw_device_info.h | 1 + src/mesa/drivers/dri/i965/intel_batchbuffer.c | 8 +++- src/mesa/drivers/dri/i965/intel_screen.c | 14 ++ src/mesa/drivers/dri/i965/intel_screen.h | 5 + 7 files changed, 36 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 4b51fe5..ec22497 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -865,6 +865,10 @@ brwCreateContext(gl_api api, brw->predicate.state = BRW_PREDICATE_STATE_RENDER; + brw->use_resource_streamer = screen->has_resource_streamer && + (brw_env_var_as_boolean("INTEL_USE_HW_BT", false) || + brw_env_var_as_boolean("INTEL_USE_GATHER", false)); + ctx->VertexProgram._MaintainTnlProgram = true; ctx->FragmentProgram._MaintainTexEnvProgram = true; diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 3553f6e..10d8f1e 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1130,6 +1130,7 @@ struct brw_context bool has_pln; bool no_simd8; bool use_rep_send; + bool use_resource_streamer; /** * Some versions of Gen hardware don't do centroid interpolation correctly diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c index 342e566..51a91b6 100644 --- a/src/mesa/drivers/dri/i965/brw_device_info.c +++ b/src/mesa/drivers/dri/i965/brw_device_info.c @@ -170,7 +170,8 @@ static const struct brw_device_info brw_device_info_byt = { #define HSW_FEATURES \ GEN7_FEATURES,\ .is_haswell = true, \ - .supports_simd16_3src = true + .supports_simd16_3src = true, \ + .has_resource_streamer = true static const struct brw_device_info brw_device_info_hsw_gt1 = { HSW_FEATURES, .gt = 1, @@ -229,6 +230,7 @@ static const struct brw_device_info brw_device_info_hsw_gt3 = { #define GEN8_FEATURES \ .gen = 8,\ .has_hiz_and_separate_stencil = true,\ + .has_resource_streamer = true, \ .must_use_separate_stencil = true, \ .has_llc = true, \ .has_pln = true, \ @@ -301,6 +303,7 @@ static const struct brw_device_info brw_device_info_chv = { #define GEN9_FEATURES \ .gen = 9,\ .has_hiz_and_separate_stencil = true,\ + .has_resource_streamer = true, \ .must_use_separate_stencil = true, \ .has_llc = true, \ .has_pln = true, \ diff --git a/src/mesa/drivers/dri/i965/brw_device_info.h b/src/mesa/drivers/dri/i965/brw_device_info.h index 7b7a1fc..2a73e93 100644 --- a/src/mesa/drivers/dri/i965/brw_device_info.h +++ b/src/mesa/drivers/dri/i965/brw_device_info.h @@ -46,6 +46,7 @@ struct brw_device_info bool has_compr4; bool has_surface_tile_offset; bool supports_simd16_3src; + bool has_resource_streamer; /** * Quirks: diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index 54081a1..4e82003 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c @@ -267,6 +267,11 @@ throttle(struct brw_context *brw) } } +/* Drop when RS headers get pulled to libdrm */ +#ifndef I915_EXEC_RESOURCE_STREAMER +#define I915_EXEC_RESOURCE_STREAMER (1<<15) +#endif + /* TODO: Push this whole function into bufmgr. */ static int @@ -293,7 +298,8 @@ do_flush_locked(struct brw_context *brw) if (brw->gen >= 6 && batch->ring == BLT_RING) { flags = I915_EXEC_BLT; } else { - flags = I915_EXEC_RENDER; +
[Mesa-dev] [PATCH v4 3/6] i965: Enable hardware-generated binding tables on render path.
This patch implements the binding table enable command which is also used to allocate a binding table pool where where hardware-generated binding table entries are flushed into. Each binding table offset in the binding table pool is unique per each shader stage that are enabled within a batch. Also insert the required brw_tracked_state objects to enable hw-generated binding tables in normal render path. v2: - Use MOCS in binding table pool alloc for GEN8 - Fix spurious offset when allocating binding table pool entry and start from zero instead. v3: - Include GEN8 fix for spurious offset above. v4: - Fixup wrong packet length in enable/disable hw-binding table for GEN8 (Ville). - Don't invoke HW-binding table disable command when we dont have resource streamer (Chris). Cc: kenn...@whitecape.org Cc: syrj...@sci.fi Cc: ch...@chris-wilson.co.uk Signed-off-by: Abdiel Janulgue --- src/mesa/drivers/dri/i965/brw_binding_tables.c | 96 ++ src/mesa/drivers/dri/i965/brw_context.c| 4 ++ src/mesa/drivers/dri/i965/brw_context.h| 6 ++ src/mesa/drivers/dri/i965/brw_state.h | 6 ++ src/mesa/drivers/dri/i965/brw_state_upload.c | 4 ++ src/mesa/drivers/dri/i965/gen7_disable.c | 4 +- src/mesa/drivers/dri/i965/gen8_disable.c | 4 +- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 4 ++ 8 files changed, 124 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c b/src/mesa/drivers/dri/i965/brw_binding_tables.c index 98ff0dd..c2cedd6 100644 --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c @@ -170,6 +170,102 @@ const struct brw_tracked_state brw_gs_binding_table = { .emit = brw_gs_upload_binding_table, }; +/** + * Hardware-generated binding tables for the resource streamer + */ +void +gen7_disable_hw_binding_tables(struct brw_context *brw) +{ + if (!brw->use_resource_streamer) + return; + + int pkt_len = brw->gen >= 8 ? 4 : 3; + + BEGIN_BATCH(pkt_len); + OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (pkt_len - 2)); + if (brw->gen >= 8) { + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + } else { + OUT_BATCH(HSW_BT_POOL_ALLOC_MUST_BE_ONE); + OUT_BATCH(0); + } + ADVANCE_BATCH(); + + /* From the Haswell PRM, Volume 7: 3D Media GPGPU, +* 3DSTATE_BINDING_TABLE_POOL_ALLOC > Programming Note: +* +* "When switching between HW and SW binding table generation, SW must +* issue a state cache invalidate." +*/ + brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE); +} + +void +gen7_enable_hw_binding_tables(struct brw_context *brw) +{ + if (!brw->use_resource_streamer) + return; + + if (!brw->hw_bt_pool.bo) { + /* We use a single re-usable buffer object for the lifetime of the + * context and size it to maximum allowed binding tables that can be + * programmed per batch: + * + * From the Haswell PRM, Volume 7: 3D Media GPGPU, + * 3DSTATE_BINDING_TABLE_POOL_ALLOC > Programming Note: + * "A maximum of 16,383 Binding tables are allowed in any batch buffer" + */ + static const int max_size = 16383 * 4; + brw->hw_bt_pool.bo = drm_intel_bo_alloc(brw->bufmgr, "hw_bt", + max_size, 64); + brw->hw_bt_pool.next_offset = 0; + } + + int pkt_len = brw->gen >= 8 ? 4 : 3; + uint32_t dw1 = BRW_HW_BINDING_TABLE_ENABLE; + if (brw->is_haswell) + dw1 |= SET_FIELD(GEN7_MOCS_L3, GEN7_HW_BT_POOL_MOCS) | + HSW_BT_POOL_ALLOC_MUST_BE_ONE; + else if (brw->gen >= 8) + dw1 |= BDW_MOCS_WB; + + BEGIN_BATCH(pkt_len); + OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (pkt_len - 2)); + if (brw->gen >= 8) { + OUT_RELOC64(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1); + OUT_BATCH(brw->hw_bt_pool.bo->size); + } else { + OUT_RELOC(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1); + OUT_RELOC(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, + brw->hw_bt_pool.bo->size); + } + ADVANCE_BATCH(); + + /* From the Haswell PRM, Volume 7: 3D Media GPGPU, +* 3DSTATE_BINDING_TABLE_POOL_ALLOC > Programming Note: +* +* "When switching between HW and SW binding table generation, SW must +* issue a state cache invalidate." +*/ + brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE); +} + +void +gen7_reset_hw_bt_pool_offsets(struct brw_context *brw) +{ + brw->hw_bt_pool.next_offset = 0; +} + +const struct brw_tracked_state gen7_hw_binding_tables = { + .dirty = { + .mesa = 0, + .brw = BRW_NEW_BATCH, + }, + .emit = gen7_enable_hw_binding_tables +}; + /** @} */ /** diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index ec22497..1952ec8 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/d
[Mesa-dev] [PATCH 6/6] i965: Disable resource streamer in BLORP
Switch off hardware-generated binding tables and gather push constants in the blorp. Blorp requires only a minimal set of simple constants. There is no need for the extra complexity to program a gather table entry into the pipeline. Cc: kenn...@whitecape.org Signed-off-by: Abdiel Janulgue --- src/mesa/drivers/dri/i965/gen7_blorp.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp b/src/mesa/drivers/dri/i965/gen7_blorp.cpp index abace6d..9822dc1 100644 --- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp +++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp @@ -794,6 +794,8 @@ gen7_blorp_exec(struct brw_context *brw, } depthstencil_offset = gen6_blorp_emit_depth_stencil_state(brw, params); gen7_blorp_emit_depth_stencil_state_pointers(brw, depthstencil_offset); + if (brw->use_resource_streamer) + gen7_disable_hw_binding_tables(brw); if (params->use_wm_prog) { uint32_t wm_surf_offset_renderbuffer; uint32_t wm_surf_offset_texture = 0; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 13/78] i965/nir/vec4: Implement conditional statements (nir_cf_node_if)
On Thu, 2015-07-02 at 10:11 -0700, Jason Ekstrand wrote: > On Wed, Jul 1, 2015 at 11:44 PM, Iago Toral wrote: > > On Tue, 2015-06-30 at 09:30 -0700, Jason Ekstrand wrote: > >> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev > >> wrote: > >> > From: Iago Toral Quiroga > >> > > >> > The same we do in the FS NIR backend, only that here we need to consider > >> > the number of components in the condition and adjust the swizzle > >> > accordingly. > >> > > >> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580 > >> > --- > >> > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 23 ++- > >> > 1 file changed, 22 insertions(+), 1 deletion(-) > >> > > >> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > >> > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > >> > index 1ec75ee..d81b6a7 100644 > >> > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > >> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > >> > @@ -314,7 +314,28 @@ vec4_visitor::nir_emit_cf_list(exec_list *list) > >> > void > >> > vec4_visitor::nir_emit_if(nir_if *if_stmt) > >> > { > >> > - /* @TODO: Not yet implemented */ > >> > + /* First, put the condition in f0 */ > >> > + src_reg condition = get_nir_src(if_stmt->condition, > >> > BRW_REGISTER_TYPE_D); > >> > + > >> > + int num_components = if_stmt->condition.is_ssa ? > >> > + if_stmt->condition.ssa->num_components : > >> > + if_stmt->condition.reg.reg->num_components; > >> > + > >> > + condition.swizzle = brw_swizzle_for_size(num_components); > >> > + > >> > + vec4_instruction *inst = emit(MOV(dst_null_d(), condition)); > >> > + inst->conditional_mod = BRW_CONDITIONAL_NZ; > >> > >> NIR if statements read only one component by definition. There's no > >> need to do this. > > > > I see, we still need to do this explicitly though: > > > > condition.swizzle = brw_swizzle_for_size(1); > > > > Maybe we should just make get_nir_src() set the swizzle based on the > > number of components instead so we don't have to do this kind of things > > after calling that, does that sound better? > > Just pass the number of components into get_nir_src()? That sounds fine to > me. > --Jason Is that better? In this case it is sufficient because it is always 1, but in other cases where we call get_nir_src I imagine we would also want to get a register with the swizzle preset based on the number of components the nir src, right? In that case, the number of components we would need to pass would come from doing something like: int num_components = src.is_ssa ? src.ssa->num_components : src.reg.reg->num_components; and doing this every time we call get_nir_src does not look very nice to me. Since we are already passing the nir src to get_nir_src, wouldn't it be better if we had that function do this for us? Iago > >> > >> > + emit(IF(BRW_PREDICATE_NORMAL)); > >> > + > >> > + nir_emit_cf_list(&if_stmt->then_list); > >> > + > >> > + /* note: if the else is empty, dead CF elimination will remove it */ > >> > + emit(BRW_OPCODE_ELSE); > >> > + > >> > + nir_emit_cf_list(&if_stmt->else_list); > >> > + > >> > + emit(BRW_OPCODE_ENDIF); > >> > } > >> > > >> > void > >> > -- > >> > 2.1.4 > >> > > >> > ___ > >> > mesa-dev mailing list > >> > mesa-dev@lists.freedesktop.org > >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > >> ___ > >> mesa-dev mailing list > >> mesa-dev@lists.freedesktop.org > >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] glsl: add a missing call to _mesa_locale_init
After c61bc6e ("util: port _mesa_strto[df] to C"), "make check" fails due to a missing _mesa_locale_init. Fixup this oversight, by moving the stand-alone compiler initializer inside initialize_context_to_defaults(). Signed-off-by: Erik Faye-Lund --- Here's what the latter suggestion would look like. Personally, I have a slight preference for this version. src/glsl/main.cpp | 3 --- src/glsl/standalone_scaffolding.cpp | 3 +++ 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp index 58651df..2341298 100644 --- a/src/glsl/main.cpp +++ b/src/glsl/main.cpp @@ -38,7 +38,6 @@ #include "program/hash_table.h" #include "loop_analysis.h" #include "standalone_scaffolding.h" -#include "util/strtod.h" static int glsl_version = 330; @@ -47,8 +46,6 @@ initialize_context(struct gl_context *ctx, gl_api api) { initialize_context_to_defaults(ctx, api); - _mesa_locale_init(); - /* The standalone compiler needs to claim support for almost * everything in order to compile the built-in functions. */ diff --git a/src/glsl/standalone_scaffolding.cpp b/src/glsl/standalone_scaffolding.cpp index 00db61e..172c6f4 100644 --- a/src/glsl/standalone_scaffolding.cpp +++ b/src/glsl/standalone_scaffolding.cpp @@ -33,6 +33,7 @@ #include #include #include "util/ralloc.h" +#include "util/strtod.h" void _mesa_warning(struct gl_context *ctx, const char *fmt, ...) @@ -191,4 +192,6 @@ void initialize_context_to_defaults(struct gl_context *ctx, gl_api api) for (int sh = 0; sh < MESA_SHADER_STAGES; ++sh) memcpy(&ctx->Const.ShaderCompilerOptions[sh], &options, sizeof(options)); + + _mesa_locale_init(); } -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] i965/vec4: Plumb log_data through so the backend_shader field gets set.
On Wed, Jul 01, 2015 at 03:03:31PM -0700, Kenneth Graunke wrote: > Jason plumbed this through a while back in the FS backend, but > apparently we were just passing NULL in the vec4 backend. > > This patch passes brw in as intended. > > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_vec4.cpp| 2 +- > src/mesa/drivers/dri/i965/brw_vec4.h | 1 + > src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 10 ++ > src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h | 1 + > src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 3 ++- > src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 4 +++- > src/mesa/drivers/dri/i965/brw_vs.h| 1 + > src/mesa/drivers/dri/i965/gen6_gs_visitor.h | 4 +++- > 8 files changed, 18 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index a5c686c..2a56564 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -1940,7 +1940,7 @@ brw_vs_emit(struct brw_context *brw, > if (!assembly) { >prog_data->base.dispatch_mode = DISPATCH_MODE_4X2_DUAL_OBJECT; > > - vec4_vs_visitor v(brw->intelScreen->compiler, > + vec4_vs_visitor v(brw->intelScreen->compiler, brw, > c, prog_data, prog, mem_ctx, st_index, > !_mesa_is_gles3(&brw->ctx)); >if (!v.run(brw_select_clip_planes(&brw->ctx))) { > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h > b/src/mesa/drivers/dri/i965/brw_vec4.h > index 2ac1693..043557b 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.h > +++ b/src/mesa/drivers/dri/i965/brw_vec4.h > @@ -77,6 +77,7 @@ class vec4_visitor : public backend_shader, public > ir_visitor > { > public: > vec4_visitor(const struct brw_compiler *compiler, > +void *log_data, As far as I can see, all the constructors addressed in this patch are "struct brw_context" aware. Could we use the type "struct brw_context *" instead of "void *"? The pointer is in the end given to shader_perf_log_mesa() which in turn unconditionally casts is to "struct brw_context *". > struct brw_vec4_compile *c, > struct gl_program *prog, > const struct brw_vue_prog_key *key, > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp > index 69bcf5a..80c59af 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp > @@ -35,12 +35,14 @@ const unsigned MAX_GS_INPUT_VERTICES = 6; > namespace brw { > > vec4_gs_visitor::vec4_gs_visitor(const struct brw_compiler *compiler, > + void *log_data, > struct brw_gs_compile *c, > struct gl_shader_program *prog, > void *mem_ctx, > bool no_spills, > int shader_time_index) > - : vec4_visitor(compiler, &c->base, &c->gp->program.Base, &c->key.base, > + : vec4_visitor(compiler, log_data, > + &c->base, &c->gp->program.Base, &c->key.base, >&c->prog_data.base, prog, MESA_SHADER_GEOMETRY, mem_ctx, >no_spills, shader_time_index), > c(c) > @@ -662,7 +664,7 @@ brw_gs_emit(struct brw_context *brw, >likely(!(INTEL_DEBUG & DEBUG_NO_DUAL_OBJECT_GS))) { > c->prog_data.base.dispatch_mode = DISPATCH_MODE_4X2_DUAL_OBJECT; > > - vec4_gs_visitor v(brw->intelScreen->compiler, > + vec4_gs_visitor v(brw->intelScreen->compiler, brw, > c, prog, mem_ctx, true /* no_spills */, st_index); > if (v.run(NULL /* clip planes */)) { > return generate_assembly(brw, prog, &c->gp->program.Base, > @@ -704,11 +706,11 @@ brw_gs_emit(struct brw_context *brw, > const unsigned *ret = NULL; > > if (brw->gen >= 7) > - gs = new vec4_gs_visitor(brw->intelScreen->compiler, > + gs = new vec4_gs_visitor(brw->intelScreen->compiler, brw, > c, prog, mem_ctx, false /* no_spills */, > st_index); > else > - gs = new gen6_gs_visitor(brw->intelScreen->compiler, > + gs = new gen6_gs_visitor(brw->intelScreen->compiler, brw, > c, prog, mem_ctx, false /* no_spills */, > st_index); > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h > b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h > index e693c56..e48d861 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h > +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h > @@ -69,6 +69,7 @@ class vec4_gs_visitor : public vec4_visitor > { > public: > vec4_gs_visitor(const struct brw_c
Re: [Mesa-dev] [PATCH 19/78] nir/nir_lower_io: Add vec4 support
On Thu, 2015-07-02 at 09:31 +0200, Iago Toral wrote: > On Tue, 2015-06-30 at 11:32 -0700, Jason Ekstrand wrote: > > I'm not sure what I think about adding an is_scalar flag vs. having > > _scalar and _vec4 versions of each function. My feeling is that once > > we tweak assign_var_locations as I mentioned for vec4 outputs, we'll > > find that we want to have them separate. The big thing here is that > > I'd rather have _vec4 and _scalar versions internally call the same > > function than have a general-looking function that switches inside on > > is_scalar and does two completely different things. So if we can > > share code like this for all of them, is_scalar is totally fine. If > > we start having divergence, different scalar/vec4 functions is > > probably better. > > --Jason > > Ok, let's see how the changes to vec4 outputs affects this. If we end up > needing more changes than just the kind of type_size() function we need > to call then I'll look into having separate paths for scalar and vec4. Notice that this patch already changes how assign_var_locations works (that calls type_size which is the one thing affected by the is_scalar flag). The mappings you saw in the implementations of the input and output intrinsics stopped being necessary with this patch and we will just remove them, it just happened that I fixed it for uniforms and did not realize that the same situation affected inputs and outputs, but we just confirmed that with this patch the indirections are not needed there either. So, as things stand now, the changes in this patch are the only changes we need to nir_lower_io for vec4 shaders, at least for now. Knowing that, are you okay with this implementation or would you rather change it to have separate functions for vec4 and scalar? Iago > > On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev > > wrote: > > > From: Iago Toral Quiroga > > > > > > The current implementation operates in scalar mode only, so add a vec4 > > > mode where types are padded to vec4 sizes. > > > > > > This will be useful in the i965 driver for its vec4 nir backend > > > (and possbly other drivers that have vec4-based shaders). > > > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580 > > > --- > > > src/glsl/nir/nir.h | 18 > > > src/glsl/nir/nir_lower_io.c | 87 > > > + > > > src/mesa/drivers/dri/i965/brw_nir.c | 18 +--- > > > 3 files changed, 90 insertions(+), 33 deletions(-) > > > > > > diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h > > > index 697d37e..334abbc 100644 > > > --- a/src/glsl/nir/nir.h > > > +++ b/src/glsl/nir/nir.h > > > @@ -1640,14 +1640,16 @@ void nir_lower_global_vars_to_local(nir_shader > > > *shader); > > > > > > void nir_lower_locals_to_regs(nir_shader *shader); > > > > > > -void nir_assign_var_locations_scalar(struct exec_list *var_list, > > > - unsigned *size); > > > -void nir_assign_var_locations_scalar_direct_first(nir_shader *shader, > > > - struct exec_list > > > *var_list, > > > - unsigned *direct_size, > > > - unsigned *size); > > > - > > > -void nir_lower_io(nir_shader *shader); > > > +void nir_assign_var_locations(struct exec_list *var_list, > > > + unsigned *size, > > > + bool is_scalar); > > > +void nir_assign_var_locations_direct_first(nir_shader *shader, > > > + struct exec_list *var_list, > > > + unsigned *direct_size, > > > + unsigned *size, > > > + bool is_scalar); > > > + > > > +void nir_lower_io(nir_shader *shader, bool is_scalar); > > > > > > void nir_lower_vars_to_ssa(nir_shader *shader); > > > > > > diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c > > > index 6761d5b..3ecd386 100644 > > > --- a/src/glsl/nir/nir_lower_io.c > > > +++ b/src/glsl/nir/nir_lower_io.c > > > @@ -29,19 +29,56 @@ > > > /* > > > * This lowering pass converts references to input/output variables with > > > * loads/stores to actual input/output intrinsics. > > > - * > > > - * NOTE: This pass really only works for scalar backends at the moment > > > due > > > - * to the way it packes the input/output data. > > > */ > > > > > > #include "nir.h" > > > > > > struct lower_io_state { > > > void *mem_ctx; > > > + bool is_scalar; > > > }; > > > > > > +static int > > > +type_size_vec4(const struct glsl_type *type) > > > +{ > > > + unsigned int i; > > > + int size; > > > + > > > + switch (glsl_get_base_type(type)) { > > > + case GLSL_TYPE_UINT: > > > + case GLSL_TYPE_INT: > > > + case GLSL_TYPE_FLOAT: > > > + case GLSL_TYPE_BOOL: > > > + if
Re: [Mesa-dev] [PATCH 2/6] i965/vec4: Move perf_debug about register spilling into the visitor.
On Wed, Jul 01, 2015 at 03:03:32PM -0700, Kenneth Graunke wrote: > This patch makes us only issue the performance warning about register > spilling if we actually spilled registers. We also use scratch space > for indirect addressing and the like. > > This is basically commit c51163b0cf7aff0375b1a5ea4cb3da9d9e164044 for > the vec4 backend. > > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_gs.c | 4 > src/mesa/drivers/dri/i965/brw_vec4.cpp | 16 +--- > src/mesa/drivers/dri/i965/brw_vs.c | 4 > 3 files changed, 13 insertions(+), 11 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_gs.c > b/src/mesa/drivers/dri/i965/brw_gs.c > index 52c7303..7f947e0 100644 > --- a/src/mesa/drivers/dri/i965/brw_gs.c > +++ b/src/mesa/drivers/dri/i965/brw_gs.c > @@ -268,10 +268,6 @@ brw_codegen_gs_prog(struct brw_context *brw, > > /* Scratch space is used for register spilling */ > if (c.base.last_scratch) { > - perf_debug("Geometry shader triggered register spilling. " > - "Try reducing the number of live vec4 values to " > - "improve performance.\n"); > - >c.prog_data.base.base.total_scratch > = brw_get_scratch_size(c.base.last_scratch*REG_SIZE); > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index 2a56564..60f73e2 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -1827,9 +1827,19 @@ vec4_visitor::run(gl_clip_plane *clip_planes) >} > } > > - while (!reg_allocate()) { > - if (failed) > - return false; > + bool allocated_without_spills = reg_allocate(); > + > + if (!allocated_without_spills) { > + compiler->shader_perf_log(log_data, > +"%s shader triggered register spilling. " > +"Try reducing the number of live vec4 values > " > +"to improve performance.\n", > +stage_name); > + > + while (!reg_allocate()) { I tried to understand a little how repeating calls to reg_allocate() differ from previous in result wise. I didn't really get it but that doesn't really prevent me from reviewing this patch. This patch preserves the logic while corresponding to the intent in commit message. Reviewed-by: Topi Pohjolainen > + if (failed) > +return false; > + } > } > > opt_schedule_instructions(); > diff --git a/src/mesa/drivers/dri/i965/brw_vs.c > b/src/mesa/drivers/dri/i965/brw_vs.c > index 6e9848f..edbcbcf 100644 > --- a/src/mesa/drivers/dri/i965/brw_vs.c > +++ b/src/mesa/drivers/dri/i965/brw_vs.c > @@ -196,10 +196,6 @@ brw_codegen_vs_prog(struct brw_context *brw, > > /* Scratch space is used for register spilling */ > if (c.base.last_scratch) { > - perf_debug("Vertex shader triggered register spilling. " > - "Try reducing the number of live vec4 values to " > - "improve performance.\n"); > - >prog_data.base.base.total_scratch > = brw_get_scratch_size(c.base.last_scratch*REG_SIZE); > > -- > 2.4.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/6] i965/vec4: Move total_scratch calculation into the visitor.
On Wed, Jul 01, 2015 at 03:03:33PM -0700, Kenneth Graunke wrote: > This is more consistent with how we do it in the FS backend, and reduces > a tiny bit of duplication. It'll also allow for a bit more tidying. And it also makes it clearer that code generation doesn't have anything to do with the scratch space allocation. Setting a value as soon as it is available is always better. Reviewed-by: Topi Pohjolainen > > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_gs.c | 5 + > src/mesa/drivers/dri/i965/brw_vec4.cpp | 7 +-- > src/mesa/drivers/dri/i965/brw_vs.c | 5 + > 3 files changed, 7 insertions(+), 10 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_gs.c > b/src/mesa/drivers/dri/i965/brw_gs.c > index 7f947e0..9c59c8a 100644 > --- a/src/mesa/drivers/dri/i965/brw_gs.c > +++ b/src/mesa/drivers/dri/i965/brw_gs.c > @@ -267,10 +267,7 @@ brw_codegen_gs_prog(struct brw_context *brw, > } > > /* Scratch space is used for register spilling */ > - if (c.base.last_scratch) { > - c.prog_data.base.base.total_scratch > - = brw_get_scratch_size(c.base.last_scratch*REG_SIZE); > - > + if (c.prog_data.base.base.total_scratch) { >brw_get_scratch_bo(brw, &stage_state->scratch_bo, >c.prog_data.base.base.total_scratch * > brw->max_gs_threads); > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index 60f73e2..7b367ec 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -1846,6 +1846,11 @@ vec4_visitor::run(gl_clip_plane *clip_planes) > > opt_set_dependency_control(); > > + if (c->last_scratch > 0) { > + prog_data->base.total_scratch = > + brw_get_scratch_size(c->last_scratch * REG_SIZE); > + } > + > /* If any state parameters were appended, then ParameterValues could have > * been realloced, in which case the driver uniform storage set up by > * _mesa_associate_uniform_storage() would point to freed memory. Make > @@ -1943,8 +1948,6 @@ brw_vs_emit(struct brw_context *brw, >} >g.generate_code(v.cfg, 8); >assembly = g.get_assembly(final_assembly_size); > - > - c->base.last_scratch = v.last_scratch; > } > > if (!assembly) { > diff --git a/src/mesa/drivers/dri/i965/brw_vs.c > b/src/mesa/drivers/dri/i965/brw_vs.c > index edbcbcf..ee3f664 100644 > --- a/src/mesa/drivers/dri/i965/brw_vs.c > +++ b/src/mesa/drivers/dri/i965/brw_vs.c > @@ -195,10 +195,7 @@ brw_codegen_vs_prog(struct brw_context *brw, > } > > /* Scratch space is used for register spilling */ > - if (c.base.last_scratch) { > - prog_data.base.base.total_scratch > - = brw_get_scratch_size(c.base.last_scratch*REG_SIZE); > - > + if (prog_data.base.base.total_scratch) { >brw_get_scratch_bo(brw, &brw->vs.base.scratch_bo, >prog_data.base.base.total_scratch * > brw->max_vs_threads); > -- > 2.4.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/6] i965/vec4: Move c->last_scratch into vec4_visitor.
On Wed, Jul 01, 2015 at 03:03:34PM -0700, Kenneth Graunke wrote: > Nothing outside of vec4_visitor uses it, so we may as well keep it > internal. > > Commit db9c915abcc5ad78d2d11d0e732f04cc94631350 for the vec4 backend. > > (The empty class will be going away soon.) > > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_vec4.cpp | 4 ++-- > src/mesa/drivers/dri/i965/brw_vec4.h| 8 ++-- > src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 2 +- > src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h | 1 - > src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp | 2 +- > src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 17 - > src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 2 +- > src/mesa/drivers/dri/i965/brw_vs.h | 1 - > 8 files changed, 15 insertions(+), 22 deletions(-) Reviewed-by: Topi Pohjolainen ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/6] i965/vs: Remove 'c'/vs_compile from vec4_vs_visitor.
On Wed, Jul 01, 2015 at 03:03:35PM -0700, Kenneth Graunke wrote: > At this point, the brw_vs_compile structure only contains the key and > gl_vertex_program pointer. We may as well pass and store them directly; > it's simpler and more convenient (key-> instead of vs_compile->key...). > > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_vec4.cpp| 4 ++-- > src/mesa/drivers/dri/i965/brw_vec4_vp.cpp | 9 +++-- > src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 11 ++- > src/mesa/drivers/dri/i965/brw_vs.h| 6 -- > 4 files changed, 15 insertions(+), 15 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index e5db268..42d014c 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -1953,8 +1953,8 @@ brw_vs_emit(struct brw_context *brw, > if (!assembly) { >prog_data->base.dispatch_mode = DISPATCH_MODE_4X2_DUAL_OBJECT; > > - vec4_vs_visitor v(brw->intelScreen->compiler, brw, > -c, prog_data, prog, mem_ctx, st_index, > + vec4_vs_visitor v(brw->intelScreen->compiler, brw, &c->key, prog_data, > +&c->vp->program, prog, mem_ctx, st_index, > !_mesa_is_gles3(&brw->ctx)); >if (!v.run(brw_select_clip_planes(&brw->ctx))) { > if (prog) { > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp > index dcbd240..d1a72d7 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp > @@ -394,8 +394,7 @@ vec4_vs_visitor::emit_program_code() > * pull constants. Do that now. > */ > if (this->need_all_constants_in_pull_buffer) { > - const struct gl_program_parameter_list *params = > - vs_compile->vp->program.Base.Parameters; > + const struct gl_program_parameter_list *params = vp->Base.Parameters; >unsigned i; >for (i = 0; i < params->NumParameters * 4; i++) { > stage_prog_data->pull_param[i] = > @@ -415,8 +414,7 @@ vec4_vs_visitor::setup_vp_regs() >vp_temp_regs[i] = src_reg(this, glsl_type::vec4_type); > > /* PROGRAM_STATE_VAR etc. */ > - struct gl_program_parameter_list *plist = > - vs_compile->vp->program.Base.Parameters; > + struct gl_program_parameter_list *plist = vp->Base.Parameters; > for (unsigned p = 0; p < plist->NumParameters; p++) { >unsigned components = plist->Parameters[p].Size; > > @@ -486,8 +484,7 @@ vec4_vs_visitor::get_vp_dst_reg(const prog_dst_register > &dst) > src_reg > vec4_vs_visitor::get_vp_src_reg(const prog_src_register &src) > { > - struct gl_program_parameter_list *plist = > - vs_compile->vp->program.Base.Parameters; > + struct gl_program_parameter_list *plist = vp->Base.Parameters; > > src_reg result; > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp > index 35b601a..b7ec8b9 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp > @@ -36,7 +36,7 @@ vec4_vs_visitor::emit_prolog() > > for (int i = 0; i < VERT_ATTRIB_MAX; i++) { >if (vs_prog_data->inputs_read & BITFIELD64_BIT(i)) { > - uint8_t wa_flags = vs_compile->key.gl_attrib_wa_flags[i]; > + uint8_t wa_flags = key->gl_attrib_wa_flags[i]; > dst_reg reg(ATTR, i); > dst_reg reg_d = reg; > reg_d.type = BRW_REGISTER_TYPE_D; > @@ -213,20 +213,21 @@ vec4_vs_visitor::emit_thread_end() > > vec4_vs_visitor::vec4_vs_visitor(const struct brw_compiler *compiler, > void *log_data, > - struct brw_vs_compile *vs_compile, > + const struct brw_vs_prog_key *key, > struct brw_vs_prog_data *vs_prog_data, > + struct gl_vertex_program *vp, > struct gl_shader_program *prog, > void *mem_ctx, > int shader_time_index, > bool use_legacy_snorm_formula) > : vec4_visitor(compiler, log_data, > - &vs_compile->vp->program.Base, > - &vs_compile->key.base, &vs_prog_data->base, prog, > + &vp->Base, &key->base, &vs_prog_data->base, prog, >MESA_SHADER_VERTEX, >mem_ctx, false /* no_spills */, >shader_time_index), > - vs_compile(vs_compile), > + key(key), > vs_prog_data(vs_prog_data), > + vp(vp), > use_legacy_snorm_formula(use_legacy_snorm_formula) > { > } > diff --git a/src/mesa/drivers/dri/i965/brw_vs.h > b/src/mesa/drivers/dri/i965/brw_vs.h >
Re: [Mesa-dev] [PATCH 5/6] i965/vs: Remove 'c'/vs_compile from vec4_vs_visitor.
On Fri, Jul 03, 2015 at 11:29:33AM +0300, Pohjolainen, Topi wrote: > On Wed, Jul 01, 2015 at 03:03:35PM -0700, Kenneth Graunke wrote: > > At this point, the brw_vs_compile structure only contains the key and > > gl_vertex_program pointer. We may as well pass and store them directly; > > it's simpler and more convenient (key-> instead of vs_compile->key...). > > > > Signed-off-by: Kenneth Graunke > > --- > > src/mesa/drivers/dri/i965/brw_vec4.cpp| 4 ++-- > > src/mesa/drivers/dri/i965/brw_vec4_vp.cpp | 9 +++-- > > src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 11 ++- > > src/mesa/drivers/dri/i965/brw_vs.h| 6 -- > > 4 files changed, 15 insertions(+), 15 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > index e5db268..42d014c 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > @@ -1953,8 +1953,8 @@ brw_vs_emit(struct brw_context *brw, > > if (!assembly) { > >prog_data->base.dispatch_mode = DISPATCH_MODE_4X2_DUAL_OBJECT; > > > > - vec4_vs_visitor v(brw->intelScreen->compiler, brw, > > -c, prog_data, prog, mem_ctx, st_index, > > + vec4_vs_visitor v(brw->intelScreen->compiler, brw, &c->key, > > prog_data, > > +&c->vp->program, prog, mem_ctx, st_index, > > !_mesa_is_gles3(&brw->ctx)); > >if (!v.run(brw_select_clip_planes(&brw->ctx))) { > > if (prog) { > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp > > b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp > > index dcbd240..d1a72d7 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp > > @@ -394,8 +394,7 @@ vec4_vs_visitor::emit_program_code() > > * pull constants. Do that now. > > */ > > if (this->need_all_constants_in_pull_buffer) { > > - const struct gl_program_parameter_list *params = > > - vs_compile->vp->program.Base.Parameters; > > + const struct gl_program_parameter_list *params = vp->Base.Parameters; > >unsigned i; > >for (i = 0; i < params->NumParameters * 4; i++) { > > stage_prog_data->pull_param[i] = > > @@ -415,8 +414,7 @@ vec4_vs_visitor::setup_vp_regs() > >vp_temp_regs[i] = src_reg(this, glsl_type::vec4_type); > > > > /* PROGRAM_STATE_VAR etc. */ > > - struct gl_program_parameter_list *plist = > > - vs_compile->vp->program.Base.Parameters; > > + struct gl_program_parameter_list *plist = vp->Base.Parameters; > > for (unsigned p = 0; p < plist->NumParameters; p++) { > >unsigned components = plist->Parameters[p].Size; > > > > @@ -486,8 +484,7 @@ vec4_vs_visitor::get_vp_dst_reg(const prog_dst_register > > &dst) > > src_reg > > vec4_vs_visitor::get_vp_src_reg(const prog_src_register &src) > > { > > - struct gl_program_parameter_list *plist = > > - vs_compile->vp->program.Base.Parameters; > > + struct gl_program_parameter_list *plist = vp->Base.Parameters; > > > > src_reg result; > > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp > > b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp > > index 35b601a..b7ec8b9 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp > > @@ -36,7 +36,7 @@ vec4_vs_visitor::emit_prolog() > > > > for (int i = 0; i < VERT_ATTRIB_MAX; i++) { > >if (vs_prog_data->inputs_read & BITFIELD64_BIT(i)) { > > - uint8_t wa_flags = vs_compile->key.gl_attrib_wa_flags[i]; > > + uint8_t wa_flags = key->gl_attrib_wa_flags[i]; > > dst_reg reg(ATTR, i); > > dst_reg reg_d = reg; > > reg_d.type = BRW_REGISTER_TYPE_D; > > @@ -213,20 +213,21 @@ vec4_vs_visitor::emit_thread_end() > > > > vec4_vs_visitor::vec4_vs_visitor(const struct brw_compiler *compiler, > > void *log_data, > > - struct brw_vs_compile *vs_compile, > > + const struct brw_vs_prog_key *key, > > struct brw_vs_prog_data *vs_prog_data, > > + struct gl_vertex_program *vp, > > struct gl_shader_program *prog, > > void *mem_ctx, > > int shader_time_index, > > bool use_legacy_snorm_formula) > > : vec4_visitor(compiler, log_data, > > - &vs_compile->vp->program.Base, > > - &vs_compile->key.base, &vs_prog_data->base, prog, > > + &vp->Base, &key->base, &vs_prog_data->base, prog, > >MESA_SHADER_VERTEX, > >mem_ctx, false /* no_spills */, > >shader_time_ind
Re: [Mesa-dev] [PATCH 6/6] i965/vs: Get rid of brw_vs_compile completely.
On Wed, Jul 01, 2015 at 03:03:36PM -0700, Kenneth Graunke wrote: > After tearing it out another level or two, and just passing the key and > vp directly, we can finally remove this struct. It also eliminates a > pointless memcpy() of the key. > > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_vec4.cpp | 37 > +- > src/mesa/drivers/dri/i965/brw_vs.c | 20 -- > src/mesa/drivers/dri/i965/brw_vs.h | 13 > 3 files changed, 31 insertions(+), 39 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index 42d014c..39715c4 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -1872,10 +1872,11 @@ extern "C" { > */ > const unsigned * > brw_vs_emit(struct brw_context *brw, > -struct gl_shader_program *prog, > -struct brw_vs_compile *c, > -struct brw_vs_prog_data *prog_data, > void *mem_ctx, > +const struct brw_vs_prog_key *key, > +struct brw_vs_prog_data *prog_data, > +struct gl_vertex_program *vp, > +struct gl_shader_program *prog, > unsigned *final_assembly_size) > { > bool start_busy = false; > @@ -1894,29 +1895,29 @@ brw_vs_emit(struct brw_context *brw, > > int st_index = -1; > if (INTEL_DEBUG & DEBUG_SHADER_TIME) > - st_index = brw_get_shader_time_index(brw, prog, &c->vp->program.Base, > + st_index = brw_get_shader_time_index(brw, prog, &vp->Base, > ST_VS); This would now fit into the previous line. Two similar insignificant formatting nits further down, either way: Reviewed-by: Topi Pohjolainen > > if (unlikely(INTEL_DEBUG & DEBUG_VS)) > - brw_dump_ir("vertex", prog, &shader->base, &c->vp->program.Base); > + brw_dump_ir("vertex", prog, &shader->base, &vp->Base); > > if (brw->intelScreen->compiler->scalar_vs) { > - if (!c->vp->program.Base.nir) { > + if (!vp->Base.nir) { > /* Normally we generate NIR in LinkShader() or >* ProgramStringNotify(), but Mesa's fixed-function vertex program >* handling doesn't notify the driver at all. Just do it here, at >* the last minute, even though it's lame. >*/ > - assert(c->vp->program.Base.Id == 0 && prog == NULL); > - c->vp->program.Base.nir = > -brw_create_nir(brw, NULL, &c->vp->program.Base, > MESA_SHADER_VERTEX); > + assert(vp->Base.Id == 0 && prog == NULL); > + vp->Base.nir = > +brw_create_nir(brw, NULL, &vp->Base, MESA_SHADER_VERTEX); >} > >prog_data->base.dispatch_mode = DISPATCH_MODE_SIMD8; > >fs_visitor v(brw->intelScreen->compiler, brw, > - mem_ctx, MESA_SHADER_VERTEX, &c->key, > - &prog_data->base.base, prog, &c->vp->program.Base, > + mem_ctx, MESA_SHADER_VERTEX, key, > + &prog_data->base.base, prog, &vp->Base, > 8, st_index); >if (!v.run_vs(brw_select_clip_planes(&brw->ctx))) { > if (prog) { > @@ -1931,8 +1932,8 @@ brw_vs_emit(struct brw_context *brw, >} > >fs_generator g(brw->intelScreen->compiler, brw, > - mem_ctx, (void *) &c->key, &prog_data->base.base, > - &c->vp->program.Base, v.promoted_constants, > + mem_ctx, (void *) key, &prog_data->base.base, You could drop the extra space before 'key'. > + &vp->Base, v.promoted_constants, > v.runtime_check_aads_emit, "VS"); >if (INTEL_DEBUG & DEBUG_VS) { > char *name; > @@ -1942,7 +1943,7 @@ brw_vs_emit(struct brw_context *brw, > prog->Name); > } else { > name = ralloc_asprintf(mem_ctx, "vertex program %d", > - c->vp->program.Base.Id); > + vp->Base.Id); > } > g.enable_debug(name); >} > @@ -1953,8 +1954,8 @@ brw_vs_emit(struct brw_context *brw, > if (!assembly) { >prog_data->base.dispatch_mode = DISPATCH_MODE_4X2_DUAL_OBJECT; > > - vec4_vs_visitor v(brw->intelScreen->compiler, brw, &c->key, prog_data, > -&c->vp->program, prog, mem_ctx, st_index, > + vec4_vs_visitor v(brw->intelScreen->compiler, brw, key, prog_data, > +vp, prog, mem_ctx, st_index, > !_mesa_is_gles3(&brw->ctx)); >if (!v.run(brw_select_clip_planes(&brw->ctx))) { > if (prog) { > @@ -1969,14 +1970,14 @@ brw_vs_emit(struct brw_context *brw, >} > >vec4_generator g(brw->intelScreen->compiler, brw, > - prog, &c->vp->program.Base, &prog_data->base, > +
[Mesa-dev] [PATCH 2/2] glsl: verify location when dual source blending
Same check is made for glBindFragDataLocationIndexed but it was missing when using layout qualifiers. Fixes following Piglit test: arb_blend_func_extended-output-location Signed-off-by: Tapani Pälli --- src/glsl/linker.cpp | 19 +++ 1 file changed, 19 insertions(+) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index aae0c0d..428ecad 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -2060,6 +2060,25 @@ assign_attribute_or_color_locations(gl_shader_program *prog, } } + /* From GL4.5 core spec, section 15.2 (Shader Execution): + * + * "Output binding assignments will cause LinkProgram to fail: + * ... + * If the program has an active output assigned to a location greater + * than or equal to the value of MAX_DUAL_SOURCE_DRAW_BUFFERS and has + * an active output assigned an index greater than or equal to one;" + */ + if (target_index == MESA_SHADER_FRAGMENT && var->data.index >= 1 && + var->data.location - generic_base >= + (int) constants->MaxDualSourceDrawBuffers) { + linker_error(prog, + "output location %d >= GL_MAX_DUAL_SOURCE_DRAW_BUFFERS " + "with index %u for %s\n", + var->data.location - generic_base, var->data.index, + var->name); + return false; + } + const unsigned slots = var->type->count_attribute_slots(); /* From GL4.5 core spec, section 11.1.1 (Vertex Attributes): -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] glsl: move max_index calc to assign_attribute_or_color_locations
Change function to get all gl_constants for inspection, this is used by follow-up patch. Signed-off-by: Tapani Pälli --- src/glsl/linker.cpp | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index 71a45e8..aae0c0d 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -1953,9 +1953,17 @@ find_available_slots(unsigned used_mask, unsigned needed_count) */ bool assign_attribute_or_color_locations(gl_shader_program *prog, - unsigned target_index, - unsigned max_index) +struct gl_constants *constants, +unsigned target_index) { + /* FINISHME: The value of the max_attribute_index parameter is +* FINISHME: implementation dependent based on the value of +* FINISHME: GL_MAX_VERTEX_ATTRIBS. GL_MAX_VERTEX_ATTRIBS must be +* FINISHME: at least 16, so hardcode 16 for now. +*/ + unsigned max_index = (target_index == MESA_SHADER_VERTEX) ? 16 : + MAX2(constants->MaxDrawBuffers, constants->MaxDualSourceDrawBuffers); + /* Mark invalid locations as being used. */ unsigned used_locations = (max_index >= 32) @@ -3061,11 +3069,11 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) * FINISHME: GL_MAX_VERTEX_ATTRIBS. GL_MAX_VERTEX_ATTRIBS must be * FINISHME: at least 16, so hardcode 16 for now. */ - if (!assign_attribute_or_color_locations(prog, MESA_SHADER_VERTEX, 16)) { + if (!assign_attribute_or_color_locations(prog, &ctx->Const, MESA_SHADER_VERTEX)) { goto done; } - if (!assign_attribute_or_color_locations(prog, MESA_SHADER_FRAGMENT, MAX2(ctx->Const.MaxDrawBuffers, ctx->Const.MaxDualSourceDrawBuffers))) { + if (!assign_attribute_or_color_locations(prog, &ctx->Const, MESA_SHADER_FRAGMENT)) { goto done; } -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v4 3/6] i965: Enable hardware-generated binding tables on render path.
On Fri, Jul 03, 2015 at 10:00:31AM +0300, Abdiel Janulgue wrote: > +void > +gen7_enable_hw_binding_tables(struct brw_context *brw) > +{ > + if (!brw->use_resource_streamer) > + return; > + > + if (!brw->hw_bt_pool.bo) { > + /* We use a single re-usable buffer object for the lifetime of the > + * context and size it to maximum allowed binding tables that can be > + * programmed per batch: > + * > + * From the Haswell PRM, Volume 7: 3D Media GPGPU, > + * 3DSTATE_BINDING_TABLE_POOL_ALLOC > Programming Note: > + * "A maximum of 16,383 Binding tables are allowed in any batch buffer" > + */ > + static const int max_size = 16383 * 4; > + brw->hw_bt_pool.bo = drm_intel_bo_alloc(brw->bufmgr, "hw_bt", > + max_size, 64); > + brw->hw_bt_pool.next_offset = 0; > + } > + > + int pkt_len = brw->gen >= 8 ? 4 : 3; > + uint32_t dw1 = BRW_HW_BINDING_TABLE_ENABLE; > + if (brw->is_haswell) > + dw1 |= SET_FIELD(GEN7_MOCS_L3, GEN7_HW_BT_POOL_MOCS) | > + HSW_BT_POOL_ALLOC_MUST_BE_ONE; > + else if (brw->gen >= 8) > + dw1 |= BDW_MOCS_WB; > + > + BEGIN_BATCH(pkt_len); > + OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (pkt_len - 2)); > + if (brw->gen >= 8) { > + OUT_RELOC64(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1); > + OUT_BATCH(brw->hw_bt_pool.bo->size); > + } else { > + OUT_RELOC(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1); > + OUT_RELOC(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, > + brw->hw_bt_pool.bo->size); > + } > + ADVANCE_BATCH(); > + > + /* From the Haswell PRM, Volume 7: 3D Media GPGPU, > +* 3DSTATE_BINDING_TABLE_POOL_ALLOC > Programming Note: > +* > +* "When switching between HW and SW binding table generation, SW must > +* issue a state cache invalidate." > +*/ > + brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE); Should this flush not be before the enable? It's a top-of-pipe sync, which means that the enable 3DSTATE will be parsed and changing the GPU state as the previous commands are still running. Since the flush is mandatory, that implies to me that the state it changing is not latched and so still being used by those earlier commands. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 23/82] glsl: Do not do CSE for expressions involving SSBO loads
Samuel Iglesias Gonsálvez writes: > On 29/06/15 09:11, Jordan Justen wrote: >> On 2015-06-24 07:36:24, Iago Toral wrote: >>> On Wed, 2015-06-24 at 15:43 +0300, Francisco Jerez wrote: AFAICT the reason why this (and many of the other changes in GLSL optimization passes) is needed is because SSBO loads have been implemented as ir_expression nodes instead of being lowered into intrinsics (as other side-effectful operations do like ARB_shader_image_load_store and ARB_shader_atomic_counters). This surely broke the assumption of a number of optimization passes that ir_expression nodes behave as pure functions. I guess the reason why you've done it this way is because UBO loads were already being represented as expressions, so I see why you may have wanted to use the same approach for SSBOs even though there is a fundamental difference between the two: UBO loads have no side effects and are constant for a given set of arguments and a given shader execution, SSBO loads and stores are not. SSBO stores couldn't be accommodated into the same framework so easily, and you decided to create a separate ir node for them, what seems inconsistent with loads. Intrinsics would probably have been a good fit for both loads and stores, and would have made all these optimization changes unnecessary... P.S.: Sorry for the late reply, I was on vacation when I was CC'ed. >>> >>> Right, your assessment about the reasons behind the current >>> implementation is correct. I did not realize of these issues when I >>> decided to go with the current implementation, now it does look like >>> going with GLSL intrinsics would have made things a bit easier. I >>> suppose it would make sense to revisit the implementation in the near >>> future taking your work on arb_shader_image_load_store as a reference. >> >> While we're waiting for curro's work to land, I was hoping to review >> and let you guys land the first ~30 front end patches. These patches >> would allow some compiler tests to pass if the extension is >> overridden. (Plus, it would take a big chunk out of this large >> series.) >> >> Unfortunately, I think you should rework the load/store ops as >> intrinsics as recommended by curro. >> >> Once you have the extension working again with intrinsics, could you >> re-post the early patches before the 'i965' patches start? >> >> Does this seem like a reasonable plan? >> > Hi Samuel, > Iago and I are working on defining SSBO load/store as GLSL IR > intrinsincs. After looking at what Francisco did for > ARB_shader_image_load_store, we found some differences. > > ARB_shader_image_load_store defines imageStore() and imageLoad() as > built-in functions and they are called explicitly by GLSL shaders. In > our case SSBO load/store are implicit and, because of that, we need to > do a lowering pass when we have all the needed SSBOs information, i.e. > at link time, similar to what we did in the patch series. > > Our idea is to make that lowering pass to inject ir_call nodes replacing > the creation of ir_ssbo_store nodes and ssbo_load expressions we had before. > > In order to inject those ir_call nodes, we are thinking about doing some > steps similar to how built-in functions are defined but inside the > lowering pass: > > 1) Create ir_function_signature for the corresponding intrinsic (SSBO > store or load) > 2) Create an ir_function with the desired name and add the signature > created in first step to it. > 3) Create an ir_call node passing as argument the created > ir_function_signature and the list of variables that SSBO store/load > need to work. > 4) Add the new ir_call to the list of IR instructions. > That sounds roughly right to me. > However we don't know if this is the proper approach for several reasons: > > * As we are executing the lowering pass in link time, we don't have the > table of symbols (it was deleted before), so we cannot add the created > ir_function to it (like built-in function's definition code does). > * Creating ir_function_signature in the lowering pass doesn't seem right > to us but, as the table of symbols has been deleted, we cannot get it > from other place if it was created before. I think these should be fine because GLSL shaders are not expected to call the intrinsic explicitly so your lowering pass can just keep a set of pointers to the intrinsics privately. AFAIK there would be no use for adding the ir_function nodes to the symbol table. > * We are finding several problems implementing this approach that makes > us think that we are missing something. > I don't think you're missing anything, it's just that the GLSL front-end's infrastructure for dealing intrinsics is close to nonexistent... > As we don't see any similar implementation in the source code (i.e. no > emission of implicit intrinsics), we don't know if this is the correct > approach to follow or if there is a better way of doing it
Re: [Mesa-dev] [PATCH v2 23/82] glsl: Do not do CSE for expressions involving SSBO loads
On Fri, 2015-07-03 at 13:23 +0300, Francisco Jerez wrote: > Samuel Iglesias Gonsálvez writes: > > > On 29/06/15 09:11, Jordan Justen wrote: > >> On 2015-06-24 07:36:24, Iago Toral wrote: > >>> On Wed, 2015-06-24 at 15:43 +0300, Francisco Jerez wrote: > AFAICT the reason why this (and many of the other changes in GLSL > optimization passes) is needed is because SSBO loads have been > implemented as ir_expression nodes instead of being lowered into > intrinsics (as other side-effectful operations do like > ARB_shader_image_load_store and ARB_shader_atomic_counters). This > surely broke the assumption of a number of optimization passes that > ir_expression nodes behave as pure functions. I guess the reason why > you've done it this way is because UBO loads were already being > represented as expressions, so I see why you may have wanted to use the > same approach for SSBOs even though there is a fundamental difference > between the two: UBO loads have no side effects and are constant for a > given set of arguments and a given shader execution, SSBO loads and > stores are not. SSBO stores couldn't be accommodated into the same > framework so easily, and you decided to create a separate ir node for > them, what seems inconsistent with loads. Intrinsics would probably > have been a good fit for both loads and stores, and would have made all > these optimization changes unnecessary... > > P.S.: Sorry for the late reply, I was on vacation when I was CC'ed. > >>> > >>> Right, your assessment about the reasons behind the current > >>> implementation is correct. I did not realize of these issues when I > >>> decided to go with the current implementation, now it does look like > >>> going with GLSL intrinsics would have made things a bit easier. I > >>> suppose it would make sense to revisit the implementation in the near > >>> future taking your work on arb_shader_image_load_store as a reference. > >> > >> While we're waiting for curro's work to land, I was hoping to review > >> and let you guys land the first ~30 front end patches. These patches > >> would allow some compiler tests to pass if the extension is > >> overridden. (Plus, it would take a big chunk out of this large > >> series.) > >> > >> Unfortunately, I think you should rework the load/store ops as > >> intrinsics as recommended by curro. > >> > >> Once you have the extension working again with intrinsics, could you > >> re-post the early patches before the 'i965' patches start? > >> > >> Does this seem like a reasonable plan? > >> > > > Hi Samuel, > > > Iago and I are working on defining SSBO load/store as GLSL IR > > intrinsincs. After looking at what Francisco did for > > ARB_shader_image_load_store, we found some differences. > > > > ARB_shader_image_load_store defines imageStore() and imageLoad() as > > built-in functions and they are called explicitly by GLSL shaders. In > > our case SSBO load/store are implicit and, because of that, we need to > > do a lowering pass when we have all the needed SSBOs information, i.e. > > at link time, similar to what we did in the patch series. > > > > Our idea is to make that lowering pass to inject ir_call nodes replacing > > the creation of ir_ssbo_store nodes and ssbo_load expressions we had before. > > > > In order to inject those ir_call nodes, we are thinking about doing some > > steps similar to how built-in functions are defined but inside the > > lowering pass: > > > > 1) Create ir_function_signature for the corresponding intrinsic (SSBO > > store or load) > > 2) Create an ir_function with the desired name and add the signature > > created in first step to it. > > 3) Create an ir_call node passing as argument the created > > ir_function_signature and the list of variables that SSBO store/load > > need to work. > > 4) Add the new ir_call to the list of IR instructions. > > > > That sounds roughly right to me. > > > However we don't know if this is the proper approach for several reasons: > > > > * As we are executing the lowering pass in link time, we don't have the > > table of symbols (it was deleted before), so we cannot add the created > > ir_function to it (like built-in function's definition code does). > > * Creating ir_function_signature in the lowering pass doesn't seem right > > to us but, as the table of symbols has been deleted, we cannot get it > > from other place if it was created before. > > I think these should be fine because GLSL shaders are not expected to > call the intrinsic explicitly so your lowering pass can just keep a set > of pointers to the intrinsics privately. AFAIK there would be no use > for adding the ir_function nodes to the symbol table. > > > * We are finding several problems implementing this approach that makes > > us think that we are missing something. > > > I don't think you're missing anything, it's just that the GLSL > front-end's infrastructure for d
Re: [Mesa-dev] [PATCH] i965/fs: Don't disable SIMD16 when using the pixel interpolator
Neil Roberts writes: > There was a comment saying that in SIMD16 mode the pixel interpolator > returns coords interleaved 8 channels at a time and that this requires > extra work to support. However, this interleaved format is exactly > what the PLN instruction requires so I don't think anything needs to > be done to support it apart from removing the line to disable it and > to ensure that the message lengths for the send message are correct. > > I am more convinced that this is correct because as it says in the > comment this interleaved output is identical to what is given in the > thread payload. The code generated to apply the plane equation to > these coordinates is identical on SIMD16 and SIMD8 except that the > dispatch width is larger which implies no special unmangling is > needed. > > Perhaps the confusion stems from the fact that the description of the > PLN instruction in the IVB PRM seems to imply that the src1 inputs are > not interleaved so it wouldn't work. However, in the HSW and BDW PRMs, > the pseudo-code is different and looks like it expects the interleaved > format. Mesa doesn't seem to generate different code on IVB to > uninterleave the payload registers and everything is working so I can > only assume that the PRM is wrong. > > I tested the interpolateAt tests on HSW and did a full Piglit run on > IVB on there were no regressions. > --- > > I've CC'd Chris Forbes because according to git-annotate he wrote the > original comment so he might know something I don't. > > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 +++ > 1 file changed, 3 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index 59081ea..717e597 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -1461,12 +1461,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, > nir_intrinsic_instr *instr > case nir_intrinsic_interp_var_at_centroid: > case nir_intrinsic_interp_var_at_sample: > case nir_intrinsic_interp_var_at_offset: { > - /* in SIMD16 mode, the pixel interpolator returns coords interleaved > - * 8 channels at a time, same as the barycentric coords presented in > - * the FS payload. this requires a bit of extra work to support. > - */ > - no16("interpolate_at_* not yet supported in SIMD16 mode."); > - Heh, I happened to come across this comment yesterday while looking for the remaining no16 calls and wondered why on earth it couldn't do the same that the normal interpolation code does. After this patch and a series coming up that will remove all SIMD8 fallbacks from the texturing code, the only case left still applicable to Gen7 hardware and later will be "SIMD16 explicit accumulator operands unsupported". Anyone? >fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2); > >/* For most messages, we need one reg of ignored data; the hardware > @@ -1531,7 +1525,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, > nir_intrinsic_instr *instr > bld.SEL(offset(src, i), itemp, fs_reg(7))); > } > > -mlen = 2; > +mlen = 2 * dispatch_width / 8; > inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET, > dst_xy, src, > fs_reg(0u)); > } > @@ -1543,7 +1537,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, > nir_intrinsic_instr *instr >} > >inst->mlen = mlen; > - inst->regs_written = 2; /* 2 floats per slot returned */ > + /* 2 floats per slot returned */ > + inst->regs_written = 2 * dispatch_width / 8; >inst->pi_noperspective = instr->variables[0]->var->data.interpolation > == > INTERP_QUALIFIER_NOPERSPECTIVE; > > -- > 1.9.3 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] glsl: check for leading zeros in array index validation
Cc: Martin Peres Cc: Tapani Pälli --- src/glsl/linker.cpp | 4 1 file changed, 4 insertions(+) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index 71a45e8..d8f1689 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -462,6 +462,10 @@ parse_program_resource_name(const GLchar *name, if (array_index < 0) return -1; + /* Check for leading zero */ + if (name[i] == '0' && name[i+1] != ']') + return -1; + *out_base_name_end = name + (i - 1); return array_index; } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] mesa: remove duplicate array index validation and extra uniform location lookup
This change removes multiple functions designed to validate an array subscript and replaces them with a call to a single function. The change also means that validation is now only done once and the index is retrived at the same time, as a result the getUniformLocation code can be merged saving an extra hash table lookup (and yet another validation call). Cc: Martin Peres Cc: Tapani Pälli --- This clean-up was done to allow AoA program interface query support to be implemented more easily. When applied to my latest AoA work all but one AoA program interface query subtest for the CTS now passes. No Piglit regressions. I also ran the following tests in the CTS without regression: - ES31-CTS.program_interface_query* - ES3-CTS.gtf.GL3Tests.explicit_attrib_location.explicit_attrib_location_room - ES3-CTS.gtf.GL3Tests.draw_instanced.draw_instanced_max_vertex_attribs src/mesa/main/program_resource.c | 66 ++--- src/mesa/main/shader_query.cpp | 206 --- src/mesa/main/shaderapi.h| 7 +- src/mesa/main/uniform_query.cpp | 75 -- src/mesa/main/uniforms.c | 7 +- src/mesa/main/uniforms.h | 4 - 6 files changed, 100 insertions(+), 265 deletions(-) diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c index d857b84..0a306b8 100644 --- a/src/mesa/main/program_resource.c +++ b/src/mesa/main/program_resource.c @@ -173,32 +173,12 @@ is_xfb_marker(const char *str) return false; } -/** - * Checks if given name index is legal for GetProgramResourceIndex, - * check is written to be compatible with GL_ARB_array_of_arrays. - */ -static bool -valid_program_resource_index_name(const GLchar *name) -{ - const char *array = strstr(name, "["); - const char *close = strrchr(name, ']'); - - /* Not array, no need for the check. */ - if (!array) - return true; - - /* Last array index has to be zero. */ - if (!close || *--close != '0') - return false; - - return true; -} - GLuint GLAPIENTRY _mesa_GetProgramResourceIndex(GLuint program, GLenum programInterface, const GLchar *name) { GET_CURRENT_CONTEXT(ctx); + unsigned array_index; struct gl_program_resource *res; struct gl_shader_program *shProg = _mesa_lookup_shader_program_err(ctx, program, @@ -221,12 +201,9 @@ _mesa_GetProgramResourceIndex(GLuint program, GLenum programInterface, case GL_PROGRAM_OUTPUT: case GL_UNIFORM: case GL_TRANSFORM_FEEDBACK_VARYING: - /* Validate name syntax for array variables */ - if (!valid_program_resource_index_name(name)) - return GL_INVALID_INDEX; - /* fall-through */ case GL_UNIFORM_BLOCK: - res = _mesa_program_resource_find_name(shProg, programInterface, name); + res = _mesa_program_resource_find_name(shProg, programInterface, name, + &array_index, true); if (!res) return GL_INVALID_INDEX; @@ -300,36 +277,6 @@ _mesa_GetProgramResourceiv(GLuint program, GLenum programInterface, propCount, props, bufSize, length, params); } -/** - * Function verifies syntax of given name for GetProgramResourceLocation - * and GetProgramResourceLocationIndex for the following cases: - * - * "array element portion of a string passed to GetProgramResourceLocation - * or GetProgramResourceLocationIndex must not have, a "+" sign, extra - * leading zeroes, or whitespace". - * - * Check is written to be compatible with GL_ARB_array_of_arrays. - */ -static bool -invalid_array_element_syntax(const GLchar *name) -{ - char *first = strchr(name, '['); - char *last = strrchr(name, '['); - - if (!first) - return false; - - /* No '+' or ' ' allowed anywhere. */ - if (strchr(first, '+') || strchr(first, ' ')) - return true; - - /* Check that last array index is 0. */ - if (last[1] == '0' && last[2] != ']') - return true; - - return false; -} - static struct gl_shader_program * lookup_linked_program(GLuint program, const char *caller) { @@ -356,7 +303,7 @@ _mesa_GetProgramResourceLocation(GLuint program, GLenum programInterface, struct gl_shader_program *shProg = lookup_linked_program(program, "glGetProgramResourceLocation"); - if (!shProg || !name || invalid_array_element_syntax(name)) + if (!shProg || !name) return -1; /* Validate programInterface. */ @@ -383,7 +330,8 @@ _mesa_GetProgramResourceLocation(GLuint program, GLenum programInterface, _mesa_lookup_enum_by_nr(programInterface), name); } - return _mesa_program_resource_location(shProg, programInterface, name); + return _mesa_program_resource_location(shProg, programInterface, name, + true); } /** @@ -397,7 +345,7 @@ _mesa_GetProgramResourceLocationIndex(GLuint program, GLenum programInterface, struct gl_shader_program *shProg
Re: [Mesa-dev] [PATCH 2/2] mesa: remove duplicate array index validation and extra uniform location lookup
On 03/07/15 13:55, Timothy Arceri wrote: This change removes multiple functions designed to validate an array subscript and replaces them with a call to a single function. The change also means that validation is now only done once and the index is retrived at the same time, as a result the getUniformLocation code can be merged saving an extra hash table lookup (and yet another validation call). Cc: Martin Peres Cc: Tapani Pälli --- This clean-up was done to allow AoA program interface query support to be implemented more easily. When applied to my latest AoA work all but one AoA program interface query subtest for the CTS now passes. No Piglit regressions. I also ran the following tests in the CTS without regression: - ES31-CTS.program_interface_query* - ES3-CTS.gtf.GL3Tests.explicit_attrib_location.explicit_attrib_location_room - ES3-CTS.gtf.GL3Tests.draw_instanced.draw_instanced_max_vertex_attribs src/mesa/main/program_resource.c | 66 ++--- src/mesa/main/shader_query.cpp | 206 --- src/mesa/main/shaderapi.h| 7 +- src/mesa/main/uniform_query.cpp | 75 -- src/mesa/main/uniforms.c | 7 +- src/mesa/main/uniforms.h | 4 - 6 files changed, 100 insertions(+), 265 deletions(-) diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c index d857b84..0a306b8 100644 --- a/src/mesa/main/program_resource.c +++ b/src/mesa/main/program_resource.c @@ -173,32 +173,12 @@ is_xfb_marker(const char *str) return false; } -/** - * Checks if given name index is legal for GetProgramResourceIndex, - * check is written to be compatible with GL_ARB_array_of_arrays. - */ -static bool -valid_program_resource_index_name(const GLchar *name) -{ - const char *array = strstr(name, "["); - const char *close = strrchr(name, ']'); - - /* Not array, no need for the check. */ - if (!array) - return true; - - /* Last array index has to be zero. */ - if (!close || *--close != '0') - return false; - - return true; -} - GLuint GLAPIENTRY _mesa_GetProgramResourceIndex(GLuint program, GLenum programInterface, const GLchar *name) { GET_CURRENT_CONTEXT(ctx); + unsigned array_index; struct gl_program_resource *res; struct gl_shader_program *shProg = _mesa_lookup_shader_program_err(ctx, program, @@ -221,12 +201,9 @@ _mesa_GetProgramResourceIndex(GLuint program, GLenum programInterface, case GL_PROGRAM_OUTPUT: case GL_UNIFORM: case GL_TRANSFORM_FEEDBACK_VARYING: - /* Validate name syntax for array variables */ - if (!valid_program_resource_index_name(name)) - return GL_INVALID_INDEX; - /* fall-through */ case GL_UNIFORM_BLOCK: - res = _mesa_program_resource_find_name(shProg, programInterface, name); + res = _mesa_program_resource_find_name(shProg, programInterface, name, + &array_index, true); if (!res) return GL_INVALID_INDEX; @@ -300,36 +277,6 @@ _mesa_GetProgramResourceiv(GLuint program, GLenum programInterface, propCount, props, bufSize, length, params); } -/** - * Function verifies syntax of given name for GetProgramResourceLocation - * and GetProgramResourceLocationIndex for the following cases: - * - * "array element portion of a string passed to GetProgramResourceLocation - * or GetProgramResourceLocationIndex must not have, a "+" sign, extra - * leading zeroes, or whitespace". - * - * Check is written to be compatible with GL_ARB_array_of_arrays. - */ -static bool -invalid_array_element_syntax(const GLchar *name) -{ - char *first = strchr(name, '['); - char *last = strrchr(name, '['); - - if (!first) - return false; - - /* No '+' or ' ' allowed anywhere. */ - if (strchr(first, '+') || strchr(first, ' ')) - return true; - - /* Check that last array index is 0. */ - if (last[1] == '0' && last[2] != ']') - return true; - - return false; -} - static struct gl_shader_program * lookup_linked_program(GLuint program, const char *caller) { @@ -356,7 +303,7 @@ _mesa_GetProgramResourceLocation(GLuint program, GLenum programInterface, struct gl_shader_program *shProg = lookup_linked_program(program, "glGetProgramResourceLocation"); - if (!shProg || !name || invalid_array_element_syntax(name)) + if (!shProg || !name) return -1; /* Validate programInterface. */ @@ -383,7 +330,8 @@ _mesa_GetProgramResourceLocation(GLuint program, GLenum programInterface, _mesa_lookup_enum_by_nr(programInterface), name); } - return _mesa_program_resource_location(shProg, programInterface, name); + return _mesa_program_resource_location(shProg, programInterface, name, + true); } /** @@ -397,7 +345,7 @@ _mesa_GetProgra
[Mesa-dev] [PATCH] i965/skl: Set the pulls bary bit in 3DSTATE_PS_EXTRA
On Gen9+ there is a new bit in 3DSTATE_PS_EXTRA that must be set if the shader sends a message to the pixel interpolator. This fixes the interpolateAt* tests on SKL, apart from interpolateatsample-nonconst but that is not implemented anywhere so it's not a regression. --- src/mesa/drivers/dri/i965/brw_context.h | 1 + src/mesa/drivers/dri/i965/brw_defines.h | 1 + src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 4 src/mesa/drivers/dri/i965/gen8_ps_state.c | 3 +++ 4 files changed, 9 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 3553f6e..7596139 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -415,6 +415,7 @@ struct brw_wm_prog_data { bool uses_pos_offset; bool uses_omask; bool uses_kill; + bool pulls_bary; uint32_t prog_offset_16; /** diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 66b9abc..19489ab 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -2145,6 +2145,7 @@ enum brw_pixel_shader_computed_depth_mode { # define GEN8_PSX_SHADER_DISABLES_ALPHA_TO_COVERAGE (1 << 7) # define GEN8_PSX_SHADER_IS_PER_SAMPLE (1 << 6) # define GEN8_PSX_SHADER_COMPUTES_STENCIL (1 << 5) +# define GEN9_PSX_SHADER_PULLS_BARY (1 << 3) # define GEN8_PSX_SHADER_HAS_UAV(1 << 2) # define GEN8_PSX_SHADER_USES_INPUT_COVERAGE_MASK (1 << 1) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index bd71404..3ebc3a2 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -1481,6 +1481,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr case nir_intrinsic_interp_var_at_centroid: case nir_intrinsic_interp_var_at_sample: case nir_intrinsic_interp_var_at_offset: { + assert(stage == MESA_SHADER_FRAGMENT); + + ((struct brw_wm_prog_data *) prog_data)->pulls_bary = true; + fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2); /* For most messages, we need one reg of ignored data; the hardware diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c b/src/mesa/drivers/dri/i965/gen8_ps_state.c index a88f109..d544509 100644 --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c @@ -58,6 +58,9 @@ gen8_upload_ps_extra(struct brw_context *brw, if (prog_data->uses_omask) dw1 |= GEN8_PSX_OMASK_TO_RENDER_TARGET; + if (brw->gen >= 9 && prog_data->pulls_bary) + dw1 |= GEN9_PSX_SHADER_PULLS_BARY; + if (_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx)) dw1 |= GEN8_PSX_SHADER_HAS_UAV; -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] gallium/ttn: mark location specially in nir for color0-writes-all
On Wed, Jul 1, 2015 at 4:27 PM, Eric Anholt wrote: > Rob Clark writes: > >> From: Ilia Mirkin >> >> We need to distinguish a shader that has separate writes to each MRT >> from one which is supposed to write the data from MRT 0 to all the MRTs. >> In TGSI this is done with a property. NIR doesn't have that, so encode >> it as a funny location and decode on the other end. >> >> Signed-off-by: Ilia Mirkin > > Yeah, seems like an OK patch for now, though we really need to start > using common enums instead. yeah.. I need to spend a bit of time on kernel side of things, but after (if no one beats me to it) I was thinking to start trying to clean up the whole nir with tgsi vs glsl thing.. BR, -R > Reviewed-by: Eric Anholt ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] mesa: remove duplicate array index validation and extra uniform location lookup
On Fri, 2015-07-03 at 15:08 +0300, Martin Peres wrote: > > On 03/07/15 13:55, Timothy Arceri wrote: > > This change removes multiple functions designed to validate an array > > subscript and replaces them with a call to a single function. > > > > The change also means that validation is now only done once and the index > > is retrived at the same time, as a result the getUniformLocation code can > > be merged saving an extra hash table lookup (and yet another validation > > call). > > > > Cc: Martin Peres > > Cc: Tapani Pälli > > --- > > This clean-up was done to allow AoA program interface query support to > > be > > implemented more easily. When applied to my latest AoA work all but one > > AoA program interface query subtest for the CTS now passes. > > > > No Piglit regressions. > > > > I also ran the following tests in the CTS without regression: > > > > - ES31-CTS.program_interface_query* > > - ES3 > > -CTS.gtf.GL3Tests.explicit_attrib_location.explicit_attrib_location_room > > - ES3-CTS.gtf.GL3Tests.draw_instanced.draw_instanced_max_vertex_attribs > > > > src/mesa/main/program_resource.c | 66 ++--- > > src/mesa/main/shader_query.cpp | 206 - > > -- > > src/mesa/main/shaderapi.h| 7 +- > > src/mesa/main/uniform_query.cpp | 75 -- > > src/mesa/main/uniforms.c | 7 +- > > src/mesa/main/uniforms.h | 4 - > > 6 files changed, 100 insertions(+), 265 deletions(-) > > > > diff --git a/src/mesa/main/program_resource.c > > b/src/mesa/main/program_resource.c > > index d857b84..0a306b8 100644 > > --- a/src/mesa/main/program_resource.c > > +++ b/src/mesa/main/program_resource.c > > @@ -173,32 +173,12 @@ is_xfb_marker(const char *str) > > return false; > > } > > > > -/** > > - * Checks if given name index is legal for GetProgramResourceIndex, > > - * check is written to be compatible with GL_ARB_array_of_arrays. > > - */ > > -static bool > > -valid_program_resource_index_name(const GLchar *name) > > -{ > > - const char *array = strstr(name, "["); > > - const char *close = strrchr(name, ']'); > > - > > - /* Not array, no need for the check. */ > > - if (!array) > > - return true; > > - > > - /* Last array index has to be zero. */ > > - if (!close || *--close != '0') > > - return false; > > - > > - return true; > > -} > > - > > GLuint GLAPIENTRY > > _mesa_GetProgramResourceIndex(GLuint program, GLenum programInterface, > > const GLchar *name) > > { > > GET_CURRENT_CONTEXT(ctx); > > + unsigned array_index; > > struct gl_program_resource *res; > > struct gl_shader_program *shProg = > > _mesa_lookup_shader_program_err(ctx, program, > > @@ -221,12 +201,9 @@ _mesa_GetProgramResourceIndex(GLuint program, GLenum > > programInterface, > > case GL_PROGRAM_OUTPUT: > > case GL_UNIFORM: > > case GL_TRANSFORM_FEEDBACK_VARYING: > > - /* Validate name syntax for array variables */ > > - if (!valid_program_resource_index_name(name)) > > - return GL_INVALID_INDEX; > > - /* fall-through */ > > case GL_UNIFORM_BLOCK: > > - res = _mesa_program_resource_find_name(shProg, programInterface, > > name); > > + res = _mesa_program_resource_find_name(shProg, programInterface, > > name, > > + &array_index, true); > > if (!res) > >return GL_INVALID_INDEX; > > > > @@ -300,36 +277,6 @@ _mesa_GetProgramResourceiv(GLuint program, GLenum > > programInterface, > > propCount, props, bufSize, length, > > params); > > } > > > > -/** > > - * Function verifies syntax of given name for GetProgramResourceLocation > > - * and GetProgramResourceLocationIndex for the following cases: > > - * > > - * "array element portion of a string passed to > > GetProgramResourceLocation > > - * or GetProgramResourceLocationIndex must not have, a "+" sign, extra > > - * leading zeroes, or whitespace". > > - * > > - * Check is written to be compatible with GL_ARB_array_of_arrays. > > - */ > > -static bool > > -invalid_array_element_syntax(const GLchar *name) > > -{ > > - char *first = strchr(name, '['); > > - char *last = strrchr(name, '['); > > - > > - if (!first) > > - return false; > > - > > - /* No '+' or ' ' allowed anywhere. */ > > - if (strchr(first, '+') || strchr(first, ' ')) > > - return true; > > - > > - /* Check that last array index is 0. */ > > - if (last[1] == '0' && last[2] != ']') > > - return true; > > - > > - return false; > > -} > > - > > static struct gl_shader_program * > > lookup_linked_program(GLuint program, const char *caller) > > { > > @@ -356,7 +303,7 @@ _mesa_GetProgramResourceLocation(GLuint program, > > GLenum programInterface, > > struct gl_shader_program *shProg = > > lookup_linked_program(program, "glGetProgr
Re: [Mesa-dev] [PATCH] glapi: fix argument parsing in glX_proto_recv.py
On 02/07/15 18:25, Dylan Baker wrote: > One of the plugins I use with vim "helpfully" added an underscore to the > front of mode for kicks. > > Obviously this isn't a feature used very often because it's been broken > since d986cb7c70db (since May 20th), and no one has noticed. > Iirc these have a long and boring history with xserver. The short version is that they (and many other files reciding in XORG_GL{API,X}_DIR) have been dead/unused for a long while. If you want to avoid the rewriting them a few times, I would suggest that we nuke them up first. That's why I've asked about your plans earlier :) -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH RFC] egl/dri2: Add dri3 support to x11 platform
Hi Boyan, Thank you for doing this ! A few suggestions which you might be interesting: Considering that the backend has handled more than dri2 perhaps we can do a s/dri2/dri/ :-) That obviously is independent of your work. On 01/07/15 16:31, Boyan Ding wrote: > Signed-off-by: Boyan Ding > --- > configure.ac |3 + > src/egl/drivers/dri2/Makefile.am |5 + > src/egl/drivers/dri2/egl_dri2.c | 65 +- > src/egl/drivers/dri2/egl_dri2.h | 12 +- > src/egl/drivers/dri2/platform_x11.c | 127 ++- > src/egl/drivers/dri2/platform_x11_dri3.c | 1591 > ++ > src/egl/drivers/dri2/platform_x11_dri3.h | 140 +++ > 7 files changed, 1926 insertions(+), 17 deletions(-) > create mode 100644 src/egl/drivers/dri2/platform_x11_dri3.c > create mode 100644 src/egl/drivers/dri2/platform_x11_dri3.h > > diff --git a/configure.ac b/configure.ac > index af61aa2..090e6c9 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -1545,6 +1545,9 @@ if test "x$enable_egl" = xyes; then > if test "x$enable_shared_glapi" = xno; then > AC_MSG_ERROR([egl_dri2 requires --enable-shared-glapi]) > fi > +if test "x$enable_dri3" = xyes; then > +EGL_LIB_DEPS="$EGL_LIB_DEPS -lxcb-dri3 -lxcb-present > -lXxf86vm -lxshmfence -lxcb-sync" Neither one of these should be a direct -lfoo expression. We should check/use PKG_CHECK_MODULES and foo_{CFLAGS,LIBS}. > +fi > else > # Avoid building an "empty" libEGL. Drop/update this > # when other backends (haiku?) come along. > diff --git a/src/egl/drivers/dri2/Makefile.am > b/src/egl/drivers/dri2/Makefile.am > index 55be4a7..d5b511c 100644 > --- a/src/egl/drivers/dri2/Makefile.am > +++ b/src/egl/drivers/dri2/Makefile.am > @@ -52,6 +52,11 @@ if HAVE_EGL_PLATFORM_X11 > libegl_dri2_la_SOURCES += platform_x11.c > AM_CFLAGS += -DHAVE_X11_PLATFORM > AM_CFLAGS += $(XCB_DRI2_CFLAGS) > +if HAVE_DRI3 > +libegl_dri2_la_SOURCES += \ > + platform_x11_dri3.c \ > + platform_x11_dri3.h > +endif > endif > > if HAVE_EGL_PLATFORM_WAYLAND > diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c > index b1b65f7..052c579 100644 > --- a/src/egl/drivers/dri2/egl_dri2.c > +++ b/src/egl/drivers/dri2/egl_dri2.c > @@ -322,6 +322,12 @@ struct dri2_extension_match { > int offset; > }; > > +static struct dri2_extension_match dri3_driver_extensions[] = { > + { __DRI_CORE, 1, offsetof(struct dri2_egl_display, core) }, > + { __DRI_IMAGE_DRIVER, 1, offsetof(struct dri2_egl_display, image_driver) > }, > + { NULL, 0, 0 } > +}; > + > static struct dri2_extension_match dri2_driver_extensions[] = { > { __DRI_CORE, 1, offsetof(struct dri2_egl_display, core) }, > { __DRI_DRI2, 2, offsetof(struct dri2_egl_display, dri2) }, > @@ -464,6 +470,24 @@ dri2_open_driver(_EGLDisplay *disp) > } > > EGLBoolean > +dri2_load_driver_dri3(_EGLDisplay *disp) dri3_load_driver perhaps ? > +{ > + struct dri2_egl_display *dri2_dpy = disp->DriverData; > + const __DRIextension **extensions; > + > + extensions = dri2_open_driver(disp); > + if (!extensions) > + return EGL_FALSE; > + > + if (!dri2_bind_extensions(dri2_dpy, dri3_driver_extensions, extensions)) { > + dlclose(dri2_dpy->driver); > + } > + dri2_dpy->driver_extensions = extensions; > + > + return EGL_TRUE; > +} > + > +EGLBoolean > dri2_load_driver(_EGLDisplay *disp) > { > struct dri2_egl_display *dri2_dpy = disp->DriverData; > @@ -507,7 +531,9 @@ dri2_setup_screen(_EGLDisplay *disp) > struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); > unsigned int api_mask; > > - if (dri2_dpy->dri2) { > + if (dri2_dpy->image_driver) { > + api_mask = dri2_dpy->image_driver->getAPIMask(dri2_dpy->dri_screen); > + } else if (dri2_dpy->dri2) { >api_mask = dri2_dpy->dri2->getAPIMask(dri2_dpy->dri_screen); > } else { >assert(dri2_dpy->swrast); > @@ -527,11 +553,12 @@ dri2_setup_screen(_EGLDisplay *disp) > if (api_mask & (1 << __DRI_API_GLES3)) >disp->ClientAPIs |= EGL_OPENGL_ES3_BIT_KHR; > > - assert(dri2_dpy->dri2 || dri2_dpy->swrast); > + assert(dri2_dpy->image_driver || dri2_dpy->dri2 || dri2_dpy->swrast); > disp->Extensions.KHR_surfaceless_context = EGL_TRUE; > disp->Extensions.MESA_configless_context = EGL_TRUE; > > - if ((dri2_dpy->dri2 && dri2_dpy->dri2->base.version >= 3) || > + if (dri2_dpy->image_driver || > + (dri2_dpy->dri2 && dri2_dpy->dri2->base.version >= 3) || > (dri2_dpy->swrast && dri2_dpy->swrast->base.version >= 3)) { >disp->Extensions.KHR_create_context = EGL_TRUE; > > @@ -591,7 +618,14 @@ dri2_create_screen(_EGLDisplay *disp) > > dri2_dpy = disp->DriverData; > > - if (dri2_dpy->dri2) { > + if (dri2_dpy->image_driver) { > + dri2_dpy->dri_screen = > + dri2_dpy->image_driver->cre
Re: [Mesa-dev] [PATCH 2/5] applegl: Provide requirements of _SET_DrawBuffers
On 02/07/15 16:32, Jon TURNEY wrote: > On 02/07/2015 13:42, Emil Velikov wrote: >> On 27 June 2015 at 12:21, Jon TURNEY wrote: >>> This smells to me like there's something not quite right about the >>> partitioning of stuff between libmesa and libglapi, but I don't have any >>> insight into what exactly the problem is. >>> >> Quite likely. Although it would be better to resolve it, as (if my >> understanding is correct) libGL should not know anything about mesa's >> internals. It ought to interact with DRI and glapi alone. >> >> Mildly related: >> Jon what happened to the applegl-vtable branches ? I really liked the >> direction that they were going. > > I'd quite forgotten about those. > > Currently, there is a build-time choice between using apple-dri, or > indirect and swrast. This is not ideal (I believe it makes remote > clients under OSX useless), so they should all be runtime alternatives. > > Those branches are really just experiments of mine to try to understand > what that might look like before I did a similar implementation for > Windows. > You had some nice work that removed a ton of the ifdef(erry) and populated the relevant vtables. All in all it almost pushed appledri up to a first class citizen (no offense meant). > But I think I came to the conclusion that it was less work to use the > existing __GLXDRIdisplay implementations as a model, rather than convert > appple-dri to that interface and then use it as a model for my > windows-dri implementation (see [1]) > Was meaning to ask: how well is it working, does it require any changes in xserver ? -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] i965/vec4: Plumb log_data through so the backend_shader field gets set.
On Friday, July 03, 2015 10:50:52 AM Pohjolainen, Topi wrote: > On Wed, Jul 01, 2015 at 03:03:31PM -0700, Kenneth Graunke wrote: > > Jason plumbed this through a while back in the FS backend, but > > apparently we were just passing NULL in the vec4 backend. > > > > This patch passes brw in as intended. > > > > Signed-off-by: Kenneth Graunke > > --- > > src/mesa/drivers/dri/i965/brw_vec4.cpp| 2 +- > > src/mesa/drivers/dri/i965/brw_vec4.h | 1 + > > src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 10 ++ > > src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h | 1 + > > src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 3 ++- > > src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 4 +++- > > src/mesa/drivers/dri/i965/brw_vs.h| 1 + > > src/mesa/drivers/dri/i965/gen6_gs_visitor.h | 4 +++- > > 8 files changed, 18 insertions(+), 8 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > index a5c686c..2a56564 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > @@ -1940,7 +1940,7 @@ brw_vs_emit(struct brw_context *brw, > > if (!assembly) { > >prog_data->base.dispatch_mode = DISPATCH_MODE_4X2_DUAL_OBJECT; > > > > - vec4_vs_visitor v(brw->intelScreen->compiler, > > + vec4_vs_visitor v(brw->intelScreen->compiler, brw, > > c, prog_data, prog, mem_ctx, st_index, > > !_mesa_is_gles3(&brw->ctx)); > >if (!v.run(brw_select_clip_planes(&brw->ctx))) { > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h > > b/src/mesa/drivers/dri/i965/brw_vec4.h > > index 2ac1693..043557b 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vec4.h > > +++ b/src/mesa/drivers/dri/i965/brw_vec4.h > > @@ -77,6 +77,7 @@ class vec4_visitor : public backend_shader, public > > ir_visitor > > { > > public: > > vec4_visitor(const struct brw_compiler *compiler, > > +void *log_data, > > As far as I can see, all the constructors addressed in this patch are > "struct brw_context" aware. Could we use the type "struct brw_context *" > instead of "void *"? The pointer is in the end given to shader_perf_log_mesa() > which in turn unconditionally casts is to "struct brw_context *". Jason is trying to separate the compiler backend from the OpenGL driver, so we can more easily reuse it...elsewhere :) "elsewhere" does not have a brw_context, but will have some other structure. So instead he made the logging functions pass a void * closure. I'm also concerned that if we pass in brw_context that people will start using it everywhere. Admittedly, having to type log_data-> might deter them, though... signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/6] i965/vec4: Move perf_debug about register spilling into the visitor.
On Friday, July 03, 2015 11:11:45 AM Pohjolainen, Topi wrote: > On Wed, Jul 01, 2015 at 03:03:32PM -0700, Kenneth Graunke wrote: > > This patch makes us only issue the performance warning about register > > spilling if we actually spilled registers. We also use scratch space > > for indirect addressing and the like. > > > > This is basically commit c51163b0cf7aff0375b1a5ea4cb3da9d9e164044 for > > the vec4 backend. > > > > Signed-off-by: Kenneth Graunke > > --- > > src/mesa/drivers/dri/i965/brw_gs.c | 4 > > src/mesa/drivers/dri/i965/brw_vec4.cpp | 16 +--- > > src/mesa/drivers/dri/i965/brw_vs.c | 4 > > 3 files changed, 13 insertions(+), 11 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_gs.c > > b/src/mesa/drivers/dri/i965/brw_gs.c > > index 52c7303..7f947e0 100644 > > --- a/src/mesa/drivers/dri/i965/brw_gs.c > > +++ b/src/mesa/drivers/dri/i965/brw_gs.c > > @@ -268,10 +268,6 @@ brw_codegen_gs_prog(struct brw_context *brw, > > > > /* Scratch space is used for register spilling */ > > if (c.base.last_scratch) { > > - perf_debug("Geometry shader triggered register spilling. " > > - "Try reducing the number of live vec4 values to " > > - "improve performance.\n"); > > - > >c.prog_data.base.base.total_scratch > > = brw_get_scratch_size(c.base.last_scratch*REG_SIZE); > > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > index 2a56564..60f73e2 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > @@ -1827,9 +1827,19 @@ vec4_visitor::run(gl_clip_plane *clip_planes) > >} > > } > > > > - while (!reg_allocate()) { > > - if (failed) > > - return false; > > + bool allocated_without_spills = reg_allocate(); > > + > > + if (!allocated_without_spills) { > > + compiler->shader_perf_log(log_data, > > +"%s shader triggered register spilling. " > > +"Try reducing the number of live vec4 > > values " > > +"to improve performance.\n", > > +stage_name); > > + > > + while (!reg_allocate()) { > > I tried to understand a little how repeating calls to reg_allocate() differ > from previous in result wise. I didn't really get it but that doesn't really > prevent me from reviewing this patch. This patch preserves the logic while > corresponding to the intent in commit message. The interface is indeed weird. reg_allocate() may fail to allocate without spilling, at which point it will spill a register, and return false. The caller is expected to call it again to retry. I don't know why it doesn't just do that itself. > Reviewed-by: Topi Pohjolainen > > > + if (failed) > > +return false; > > + } > > } > > > > opt_schedule_instructions(); > > diff --git a/src/mesa/drivers/dri/i965/brw_vs.c > > b/src/mesa/drivers/dri/i965/brw_vs.c > > index 6e9848f..edbcbcf 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vs.c > > +++ b/src/mesa/drivers/dri/i965/brw_vs.c > > @@ -196,10 +196,6 @@ brw_codegen_vs_prog(struct brw_context *brw, > > > > /* Scratch space is used for register spilling */ > > if (c.base.last_scratch) { > > - perf_debug("Vertex shader triggered register spilling. " > > - "Try reducing the number of live vec4 values to " > > - "improve performance.\n"); > > - > >prog_data.base.base.total_scratch > > = brw_get_scratch_size(c.base.last_scratch*REG_SIZE); > > > signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 39/78] i965/nir/vec4: Add swizzle utility method for vector ops
Hi Jason, On mar, 2015-06-30 at 14:18 -0700, Jason Ekstrand wrote: > On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev wrote: > > From: Antia Puentes > > > > For operations that have a predefined operand size > 0, defined in > > glsl/nir/nir_opcodes.c, NIR returns a swizzle containing zeros in the > > components from outside the source vector. However, the driver > > expects those components to have a swizzle value equal to the swizzle > > of the component in (number of vector elements - 1). For example, > > > >for a vec2 operation with an identity swizzle (.xy): > > - (NIR -> XYXX, driver ->X) > > > >for a vec3 operation with swizzle (.zxy) > > - (NIR-> ZXYX, driver -> ZXYY) > > Why is this needed. Was there some bug you ran into regarding this or > are you just trying to make the generated code match? If the > component isn't used, then it seems to me like the swizzle shouldn't > matter. NIR may give you bogus swizzles outside of the enabled > channels but it may do that regardless. I'm also not seeing how > composing swizzles fixes anything. There were several bugs in our NIR->vec4 code path related to NIR returning bogus swizzles outside the components of a vector. When doing an "any", "equal" or "notEqual" operation on vectors, the result of the instruction "cmp" is predicated using "all4h" or "any4h" depending on the operation. The results were incorrect for vectors with less than 4 elements because: (1) I was not setting the writemask of the "cmp" operation according to the vector size and (2) NIR returns bogus swizzles for components outside the vector. As a consequence, we had spurious component-wise comparisons that affected the result of the predication. As composing the operand swizzle with the identity swizzle-per-size means to repeat the value of the swizzle of the last vector-element on the components outside the vector, the error mentioned above does not happen because to repeat the last (inside-boundaries) component-wise comparison for the components outside the vector does not affect the result of the predication. Other way to fix the bugs above is to set the writemask of the "cmp" operation to the appropriate mask according to the vector size, thus forcing the "cmp" instruction to compare only the components inside the vector boundaries. As the vec4_visitor does not set the writemask of the "cmp" instruction because it relies on the operand having the correct swizzle outside the vector boundaries, for consistency with the existing behaviour, I decided to do the same and compose the swizzle (notice that the composing is done in the vec4_visitor when visiting an ir_swizzle, while in NIR we only set the swizzle for the elements inside the vector). I can remove this function if you prefer setting the writemask over composing the swizzles. I remember another regression related to bogus swizzles in unpackHalf2x16, for that function the operand is an uint, however in the vec4_visitor::emit_unpack_half_2x16 function we have: tmp_dst.writemask = WRITEMASK_Y; emit(SHR(tmp_dst, src0, src_reg(16u))); This was not buggy in the old vec4_visitor because the src0's swizzle was correct for the component Y (src0's swizzle would be either , the identity swizzle for size == 1, or the result of composing the swizzle with in case the parent IR node was an ir_swizzle). In the case of NIR, as the swizzle can be bogus for all the components except the first one (size == 1), we need to compose it. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] glsl: add a missing call to _mesa_locale_init
On Fri, Jul 3, 2015 at 12:46 AM, Erik Faye-Lund wrote: > After c61bc6e ("util: port _mesa_strto[df] to C"), "make check" > fails due to a missing _mesa_locale_init. Fixup this oversight, > by moving the stand-alone compiler initializer inside > initialize_context_to_defaults(). > > Signed-off-by: Erik Faye-Lund > --- > Here's what the latter suggestion would look like. Personally, > I have a slight preference for this version. Thanks, I've pushed this instead. Reviewed-by: Matt Turner http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/skl: Set the pulls bary bit in 3DSTATE_PS_EXTRA
On Fri, Jul 03, 2015 at 01:15:21PM +0100, Neil Roberts wrote: > On Gen9+ there is a new bit in 3DSTATE_PS_EXTRA that must be set if > the shader sends a message to the pixel interpolator. This fixes the > interpolateAt* tests on SKL, apart from interpolateatsample-nonconst > but that is not implemented anywhere so it's not a regression. > --- > src/mesa/drivers/dri/i965/brw_context.h | 1 + > src/mesa/drivers/dri/i965/brw_defines.h | 1 + > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 4 > src/mesa/drivers/dri/i965/gen8_ps_state.c | 3 +++ > 4 files changed, 9 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > b/src/mesa/drivers/dri/i965/brw_context.h > index 3553f6e..7596139 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.h > +++ b/src/mesa/drivers/dri/i965/brw_context.h > @@ -415,6 +415,7 @@ struct brw_wm_prog_data { > bool uses_pos_offset; > bool uses_omask; > bool uses_kill; > + bool pulls_bary; > uint32_t prog_offset_16; > > /** > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > b/src/mesa/drivers/dri/i965/brw_defines.h > index 66b9abc..19489ab 100644 > --- a/src/mesa/drivers/dri/i965/brw_defines.h > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > @@ -2145,6 +2145,7 @@ enum brw_pixel_shader_computed_depth_mode { > # define GEN8_PSX_SHADER_DISABLES_ALPHA_TO_COVERAGE (1 << 7) > # define GEN8_PSX_SHADER_IS_PER_SAMPLE (1 << 6) > # define GEN8_PSX_SHADER_COMPUTES_STENCIL (1 << 5) > +# define GEN9_PSX_SHADER_PULLS_BARY (1 << 3) > # define GEN8_PSX_SHADER_HAS_UAV(1 << 2) > # define GEN8_PSX_SHADER_USES_INPUT_COVERAGE_MASK (1 << 1) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index bd71404..3ebc3a2 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -1481,6 +1481,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, > nir_intrinsic_instr *instr > case nir_intrinsic_interp_var_at_centroid: > case nir_intrinsic_interp_var_at_sample: > case nir_intrinsic_interp_var_at_offset: { > + assert(stage == MESA_SHADER_FRAGMENT); > + > + ((struct brw_wm_prog_data *) prog_data)->pulls_bary = true; > + >fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2); > >/* For most messages, we need one reg of ignored data; the hardware > diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c > b/src/mesa/drivers/dri/i965/gen8_ps_state.c > index a88f109..d544509 100644 > --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c > +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c > @@ -58,6 +58,9 @@ gen8_upload_ps_extra(struct brw_context *brw, > if (prog_data->uses_omask) >dw1 |= GEN8_PSX_OMASK_TO_RENDER_TARGET; > > + if (brw->gen >= 9 && prog_data->pulls_bary) > + dw1 |= GEN9_PSX_SHADER_PULLS_BARY; > + > if (_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx)) >dw1 |= GEN8_PSX_SHADER_HAS_UAV; > It's unclear to me what the downside to always setting this bit would be (I assume that's the behavior of previous gens). I also assume this means you're abandoning the other patch, or doing it on top of this, else you don't want to do it for the centroid case. Reviewed-by: Ben Widawsky ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/skl: Set the pulls bary bit in 3DSTATE_PS_EXTRA
On Friday, July 03, 2015 01:15:21 PM Neil Roberts wrote: > On Gen9+ there is a new bit in 3DSTATE_PS_EXTRA that must be set if > the shader sends a message to the pixel interpolator. This fixes the > interpolateAt* tests on SKL, apart from interpolateatsample-nonconst > but that is not implemented anywhere so it's not a regression. > --- > src/mesa/drivers/dri/i965/brw_context.h | 1 + > src/mesa/drivers/dri/i965/brw_defines.h | 1 + > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 4 > src/mesa/drivers/dri/i965/gen8_ps_state.c | 3 +++ > 4 files changed, 9 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > b/src/mesa/drivers/dri/i965/brw_context.h > index 3553f6e..7596139 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.h > +++ b/src/mesa/drivers/dri/i965/brw_context.h > @@ -415,6 +415,7 @@ struct brw_wm_prog_data { > bool uses_pos_offset; > bool uses_omask; > bool uses_kill; > + bool pulls_bary; > uint32_t prog_offset_16; > > /** > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > b/src/mesa/drivers/dri/i965/brw_defines.h > index 66b9abc..19489ab 100644 > --- a/src/mesa/drivers/dri/i965/brw_defines.h > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > @@ -2145,6 +2145,7 @@ enum brw_pixel_shader_computed_depth_mode { > # define GEN8_PSX_SHADER_DISABLES_ALPHA_TO_COVERAGE (1 << 7) > # define GEN8_PSX_SHADER_IS_PER_SAMPLE (1 << 6) > # define GEN8_PSX_SHADER_COMPUTES_STENCIL (1 << 5) > +# define GEN9_PSX_SHADER_PULLS_BARY (1 << 3) > # define GEN8_PSX_SHADER_HAS_UAV(1 << 2) > # define GEN8_PSX_SHADER_USES_INPUT_COVERAGE_MASK (1 << 1) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index bd71404..3ebc3a2 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -1481,6 +1481,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, > nir_intrinsic_instr *instr > case nir_intrinsic_interp_var_at_centroid: > case nir_intrinsic_interp_var_at_sample: > case nir_intrinsic_interp_var_at_offset: { > + assert(stage == MESA_SHADER_FRAGMENT); > + > + ((struct brw_wm_prog_data *) prog_data)->pulls_bary = true; > + >fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2); > >/* For most messages, we need one reg of ignored data; the hardware > diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c > b/src/mesa/drivers/dri/i965/gen8_ps_state.c > index a88f109..d544509 100644 > --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c > +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c > @@ -58,6 +58,9 @@ gen8_upload_ps_extra(struct brw_context *brw, > if (prog_data->uses_omask) >dw1 |= GEN8_PSX_OMASK_TO_RENDER_TARGET; > > + if (brw->gen >= 9 && prog_data->pulls_bary) > + dw1 |= GEN9_PSX_SHADER_PULLS_BARY; > + > if (_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx)) >dw1 |= GEN8_PSX_SHADER_HAS_UAV; > > Good find! That explains a lot. Cc: "10.6 10.5" Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glapi: fix argument parsing in glX_proto_recv.py
I asked about dumping them and was told that they would leave mesa only when x itself died. If be more than happy to see them go if they're not useful On Jul 3, 2015 06:28, "Emil Velikov" wrote: > On 02/07/15 18:25, Dylan Baker wrote: > > One of the plugins I use with vim "helpfully" added an underscore to the > > front of mode for kicks. > > > > Obviously this isn't a feature used very often because it's been broken > > since d986cb7c70db (since May 20th), and no one has noticed. > > > Iirc these have a long and boring history with xserver. The short > version is that they (and many other files reciding in > XORG_GL{API,X}_DIR) have been dead/unused for a long while. > > If you want to avoid the rewriting them a few times, I would suggest > that we nuke them up first. That's why I've asked about your plans > earlier :) > > -Emil > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91222] lp_test_format regression on CentOS 7
https://bugs.freedesktop.org/show_bug.cgi?id=91222 Bug ID: 91222 Summary: lp_test_format regression on CentOS 7 Product: Mesa Version: git Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Keywords: bisected, regression Severity: normal Priority: medium Component: Mesa core Assignee: mesa-dev@lists.freedesktop.org Reporter: v...@freedesktop.org QA Contact: mesa-dev@lists.freedesktop.org CC: airl...@freedesktop.org, bri...@vmware.com, jfons...@vmware.com, srol...@vmware.com mesa: 83984f134b4a1e2829cb238c404bc82c98be6082 (master 10.7.0-devel) llvm: llvm-3.4.2-7.el7.x86_64 $ ./build/linux-x86_64-debug/bin/lp_test_format couldn't open libtxc_dxtn.so, software DXTn compression/decompression unavailable Testing PIPE_FORMAT_B8G8R8A8_UNORM (float) ... Testing PIPE_FORMAT_B8G8R8A8_UNORM (unorm8) ... Testing PIPE_FORMAT_B8G8R8X8_UNORM (float) ... Testing PIPE_FORMAT_B8G8R8X8_UNORM (unorm8) ... Testing PIPE_FORMAT_A8R8G8B8_UNORM (float) ... Testing PIPE_FORMAT_A8R8G8B8_UNORM (unorm8) ... Testing PIPE_FORMAT_X8R8G8B8_UNORM (float) ... Testing PIPE_FORMAT_X8R8G8B8_UNORM (unorm8) ... Testing PIPE_FORMAT_B5G5R5A1_UNORM (float) ... Testing PIPE_FORMAT_B5G5R5A1_UNORM (unorm8) ... Testing PIPE_FORMAT_B4G4R4A4_UNORM (float) ... Testing PIPE_FORMAT_B4G4R4A4_UNORM (unorm8) ... Testing PIPE_FORMAT_B5G6R5_UNORM (float) ... Testing PIPE_FORMAT_B5G6R5_UNORM (unorm8) ... Testing PIPE_FORMAT_R10G10B10A2_UNORM (float) ... Testing PIPE_FORMAT_R10G10B10A2_UNORM (unorm8) ... Testing PIPE_FORMAT_L8_UNORM (float) ... Testing PIPE_FORMAT_L8_UNORM (unorm8) ... Testing PIPE_FORMAT_A8_UNORM (float) ... Testing PIPE_FORMAT_A8_UNORM (unorm8) ... Testing PIPE_FORMAT_I8_UNORM (float) ... Testing PIPE_FORMAT_I8_UNORM (unorm8) ... Testing PIPE_FORMAT_L8A8_UNORM (float) ... Testing PIPE_FORMAT_L8A8_UNORM (unorm8) ... Testing PIPE_FORMAT_L16_UNORM (float) ... Testing PIPE_FORMAT_L16_UNORM (unorm8) ... Testing PIPE_FORMAT_UYVY (float) ... Testing PIPE_FORMAT_UYVY (unorm8) ... Testing PIPE_FORMAT_YUYV (float) ... Testing PIPE_FORMAT_YUYV (unorm8) ... Testing PIPE_FORMAT_R32_FLOAT (float) ... Testing PIPE_FORMAT_R32_FLOAT (unorm8) ... Testing PIPE_FORMAT_R32G32_FLOAT (float) ... Testing PIPE_FORMAT_R32G32_FLOAT (unorm8) ... Testing PIPE_FORMAT_R32G32B32_FLOAT (float) ... Testing PIPE_FORMAT_R32G32B32_FLOAT (unorm8) ... Testing PIPE_FORMAT_R32G32B32A32_FLOAT (float) ... Testing PIPE_FORMAT_R32G32B32A32_FLOAT (unorm8) ... Testing PIPE_FORMAT_R32_UNORM (float) ... Testing PIPE_FORMAT_R32_UNORM (unorm8) ... Testing PIPE_FORMAT_R32G32_UNORM (float) ... Testing PIPE_FORMAT_R32G32_UNORM (unorm8) ... Testing PIPE_FORMAT_R32G32B32_UNORM (float) ... Testing PIPE_FORMAT_R32G32B32_UNORM (unorm8) ... Testing PIPE_FORMAT_R32G32B32A32_UNORM (float) ... Testing PIPE_FORMAT_R32G32B32A32_UNORM (unorm8) ... Testing PIPE_FORMAT_R32_USCALED (float) ... Testing PIPE_FORMAT_R32_USCALED (unorm8) ... Testing PIPE_FORMAT_R32G32_USCALED (float) ... Testing PIPE_FORMAT_R32G32_USCALED (unorm8) ... Testing PIPE_FORMAT_R32G32B32_USCALED (float) ... Testing PIPE_FORMAT_R32G32B32_USCALED (unorm8) ... Testing PIPE_FORMAT_R32G32B32A32_USCALED (float) ... Testing PIPE_FORMAT_R32G32B32A32_USCALED (unorm8) ... Testing PIPE_FORMAT_R32_SNORM (float) ... Testing PIPE_FORMAT_R32_SNORM (unorm8) ... Testing PIPE_FORMAT_R32G32_SNORM (float) ... Testing PIPE_FORMAT_R32G32_SNORM (unorm8) ... Testing PIPE_FORMAT_R32G32B32_SNORM (float) ... Testing PIPE_FORMAT_R32G32B32_SNORM (unorm8) ... Testing PIPE_FORMAT_R32G32B32A32_SNORM (float) ... Testing PIPE_FORMAT_R32G32B32A32_SNORM (unorm8) ... Testing PIPE_FORMAT_R32_SSCALED (float) ... Testing PIPE_FORMAT_R32_SSCALED (unorm8) ... Testing PIPE_FORMAT_R32G32_SSCALED (float) ... Testing PIPE_FORMAT_R32G32_SSCALED (unorm8) ... Testing PIPE_FORMAT_R32G32B32_SSCALED (float) ... Testing PIPE_FORMAT_R32G32B32_SSCALED (unorm8) ... Testing PIPE_FORMAT_R32G32B32A32_SSCALED (float) ... Testing PIPE_FORMAT_R32G32B32A32_SSCALED (unorm8) ... Testing PIPE_FORMAT_R16_UNORM (float) ... Testing PIPE_FORMAT_R16_UNORM (unorm8) ... Testing PIPE_FORMAT_R16G16_UNORM (float) ... Testing PIPE_FORMAT_R16G16_UNORM (unorm8) ... Testing PIPE_FORMAT_R16G16B16_UNORM (float) ... Testing PIPE_FORMAT_R16G16B16_UNORM (unorm8) ... Testing PIPE_FORMAT_R16G16B16A16_UNORM (float) ... Testing PIPE_FORMAT_R16G16B16A16_UNORM (unorm8) ... Testing PIPE_FORMAT_R16_USCALED (float) ... Testing PIPE_FORMAT_R16_USCALED (unorm8) ... Testing PIPE_FORMAT_R16G16_USCALED (float) ... Testing PIPE_FORMAT_R16G16_USCALED (unorm8) ... Testing PIPE_FORMAT_R16G16B16_USCALED (float) ... Both operands to a binary operator are not of the same type! %10 = and <4 x i16> %7, <4 x i32> %9 Both operands to a binary operator are not of the same type! %12 = and <4 x i16> , <4 x i32> %11 Broken module found, verification continues.
[Mesa-dev] [PATCH V2 4/4] mesa: remove now unused _mesa_get_uniform_location
Cc: Tapani Pälli --- src/mesa/main/uniform_query.cpp | 75 - src/mesa/main/uniforms.h| 4 --- 2 files changed, 79 deletions(-) diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index cab5083..680ba16 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -978,81 +978,6 @@ _mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg, } -/** - * Called via glGetUniformLocation(). - * - * Returns the uniform index into UniformStorage (also the - * glGetActiveUniformsiv uniform index), and stores the referenced - * array offset in *offset, or GL_INVALID_INDEX (-1). - */ -extern "C" unsigned -_mesa_get_uniform_location(struct gl_shader_program *shProg, - const GLchar *name, - unsigned *out_offset) -{ - /* Page 80 (page 94 of the PDF) of the OpenGL 2.1 spec says: -* -* "The first element of a uniform array is identified using the -* name of the uniform array appended with "[0]". Except if the last -* part of the string name indicates a uniform array, then the -* location of the first element of that array can be retrieved by -* either using the name of the uniform array, or the name of the -* uniform array appended with "[0]"." -* -* Note: since uniform names are not allowed to use whitespace, and array -* indices within uniform names are not allowed to use "+", "-", or leading -* zeros, it follows that each uniform has a unique name up to the possible -* ambiguity with "[0]" noted above. Therefore we don't need to worry -* about mal-formed inputs--they will properly fail when we try to look up -* the uniform name in shProg->UniformHash. -*/ - - const GLchar *base_name_end; - long offset = parse_program_resource_name(name, &base_name_end); - bool array_lookup = offset >= 0; - char *name_copy; - - if (array_lookup) { - name_copy = (char *) malloc(base_name_end - name + 1); - memcpy(name_copy, name, base_name_end - name); - name_copy[base_name_end - name] = '\0'; - } else { - name_copy = (char *) name; - offset = 0; - } - - unsigned location = 0; - const bool found = shProg->UniformHash->get(location, name_copy); - - assert(!found - || strcmp(name_copy, shProg->UniformStorage[location].name) == 0); - - /* Free the temporary buffer *before* possibly returning an error. -*/ - if (name_copy != name) - free(name_copy); - - if (!found) - return GL_INVALID_INDEX; - - /* If the uniform is built-in, fail. */ - if (shProg->UniformStorage[location].builtin) - return GL_INVALID_INDEX; - - /* If the uniform is an array, fail if the index is out of bounds. -* (A negative index is caught above.) This also fails if the uniform -* is not an array, but the user is trying to index it, because -* array_elements is zero and offset >= 0. -*/ - if (array_lookup - && offset >= (long) shProg->UniformStorage[location].array_elements) { - return GL_INVALID_INDEX; - } - - *out_offset = offset; - return location; -} - extern "C" bool _mesa_sampler_uniforms_are_valid(const struct gl_shader_program *shProg, char *errMsg, size_t errMsgLength) diff --git a/src/mesa/main/uniforms.h b/src/mesa/main/uniforms.h index bd7b05e..e62eaa5 100644 --- a/src/mesa/main/uniforms.h +++ b/src/mesa/main/uniforms.h @@ -343,10 +343,6 @@ void GLAPIENTRY _mesa_ProgramUniformMatrix4x3dv(GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLdouble *value); -unsigned -_mesa_get_uniform_location(struct gl_shader_program *shProg, - const GLchar *name, unsigned *offset); - void _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shader_program, GLint location, GLsizei count, -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V2 3/4] mesa: remove overlapping array index validation and extra uniform location lookup
This change removes multiple functions designed to validate an array subscript and replaces them with a call to a single function. The change also means that validation is now only done once and the index is retrived at the same time, as a result the getUniformLocation code can be simplified saving an extra hash table lookup (and yet another validation call). V2: Fix bounds checks for program input/output, split unrelated comment fix and _mesa_get_uniform_location() removal into their own patch. Cc: Tapani Pälli --- src/mesa/main/program_resource.c | 66 ++--- src/mesa/main/shader_query.cpp | 204 --- src/mesa/main/shaderapi.h| 7 +- src/mesa/main/uniforms.c | 7 +- 4 files changed, 100 insertions(+), 184 deletions(-) diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c index d857b84..0a306b8 100644 --- a/src/mesa/main/program_resource.c +++ b/src/mesa/main/program_resource.c @@ -173,32 +173,12 @@ is_xfb_marker(const char *str) return false; } -/** - * Checks if given name index is legal for GetProgramResourceIndex, - * check is written to be compatible with GL_ARB_array_of_arrays. - */ -static bool -valid_program_resource_index_name(const GLchar *name) -{ - const char *array = strstr(name, "["); - const char *close = strrchr(name, ']'); - - /* Not array, no need for the check. */ - if (!array) - return true; - - /* Last array index has to be zero. */ - if (!close || *--close != '0') - return false; - - return true; -} - GLuint GLAPIENTRY _mesa_GetProgramResourceIndex(GLuint program, GLenum programInterface, const GLchar *name) { GET_CURRENT_CONTEXT(ctx); + unsigned array_index; struct gl_program_resource *res; struct gl_shader_program *shProg = _mesa_lookup_shader_program_err(ctx, program, @@ -221,12 +201,9 @@ _mesa_GetProgramResourceIndex(GLuint program, GLenum programInterface, case GL_PROGRAM_OUTPUT: case GL_UNIFORM: case GL_TRANSFORM_FEEDBACK_VARYING: - /* Validate name syntax for array variables */ - if (!valid_program_resource_index_name(name)) - return GL_INVALID_INDEX; - /* fall-through */ case GL_UNIFORM_BLOCK: - res = _mesa_program_resource_find_name(shProg, programInterface, name); + res = _mesa_program_resource_find_name(shProg, programInterface, name, + &array_index, true); if (!res) return GL_INVALID_INDEX; @@ -300,36 +277,6 @@ _mesa_GetProgramResourceiv(GLuint program, GLenum programInterface, propCount, props, bufSize, length, params); } -/** - * Function verifies syntax of given name for GetProgramResourceLocation - * and GetProgramResourceLocationIndex for the following cases: - * - * "array element portion of a string passed to GetProgramResourceLocation - * or GetProgramResourceLocationIndex must not have, a "+" sign, extra - * leading zeroes, or whitespace". - * - * Check is written to be compatible with GL_ARB_array_of_arrays. - */ -static bool -invalid_array_element_syntax(const GLchar *name) -{ - char *first = strchr(name, '['); - char *last = strrchr(name, '['); - - if (!first) - return false; - - /* No '+' or ' ' allowed anywhere. */ - if (strchr(first, '+') || strchr(first, ' ')) - return true; - - /* Check that last array index is 0. */ - if (last[1] == '0' && last[2] != ']') - return true; - - return false; -} - static struct gl_shader_program * lookup_linked_program(GLuint program, const char *caller) { @@ -356,7 +303,7 @@ _mesa_GetProgramResourceLocation(GLuint program, GLenum programInterface, struct gl_shader_program *shProg = lookup_linked_program(program, "glGetProgramResourceLocation"); - if (!shProg || !name || invalid_array_element_syntax(name)) + if (!shProg || !name) return -1; /* Validate programInterface. */ @@ -383,7 +330,8 @@ _mesa_GetProgramResourceLocation(GLuint program, GLenum programInterface, _mesa_lookup_enum_by_nr(programInterface), name); } - return _mesa_program_resource_location(shProg, programInterface, name); + return _mesa_program_resource_location(shProg, programInterface, name, + true); } /** @@ -397,7 +345,7 @@ _mesa_GetProgramResourceLocationIndex(GLuint program, GLenum programInterface, struct gl_shader_program *shProg = lookup_linked_program(program, "glGetProgramResourceLocationIndex"); - if (!shProg || !name || invalid_array_element_syntax(name)) + if (!shProg || !name) return -1; /* From the GL_ARB_program_interface_query spec: diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp index 0473c2e..3b08ea1 100644 --- a/src/mesa/main/shader_query.cpp +++ b/src/mesa/main/shader_query.cpp @@ -44,7 +44,8 @@ extern "C" { st
[Mesa-dev] [PATCH V2 2/4] mesa: fix incorrect comment
Cc: Tapani Pälli --- src/mesa/main/shader_query.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp index a6246a3..0473c2e 100644 --- a/src/mesa/main/shader_query.cpp +++ b/src/mesa/main/shader_query.cpp @@ -787,8 +787,6 @@ program_resource_location(struct gl_shader_program *shProg, /** * Function implements following location queries: - *glGetAttribLocation - *glGetFragDataLocation *glGetUniformLocation */ GLint -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V2 1/4] glsl: check for leading zeros in array index validation
Cc: Tapani Pälli --- src/glsl/linker.cpp | 4 1 file changed, 4 insertions(+) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index 71a45e8..d8f1689 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -462,6 +462,10 @@ parse_program_resource_name(const GLchar *name, if (array_index < 0) return -1; + /* Check for leading zero */ + if (name[i] == '0' && name[i+1] != ']') + return -1; + *out_base_name_end = name + (i - 1); return array_index; } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallivm: fix lp_build_compare_ext
From: Roland Scheidegger The expansion should always be to the same width as the input arguments no matter what, since these functions should work with any bit width of the arguments (the sext is a no-op on any sane simd architecture). Thus, fix the caller expecting differently. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=91222 (not tested otherwise) --- src/gallium/auxiliary/gallivm/lp_bld_logic.c | 2 +- src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_logic.c b/src/gallium/auxiliary/gallivm/lp_bld_logic.c index f724cfa..80b53e5 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_logic.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_logic.c @@ -81,7 +81,7 @@ lp_build_compare_ext(struct gallivm_state *gallivm, boolean ordered) { LLVMBuilderRef builder = gallivm->builder; - LLVMTypeRef int_vec_type = lp_build_int_vec_type(gallivm, lp_type_int_vec(32, 32 * type.length)); + LLVMTypeRef int_vec_type = lp_build_int_vec_type(gallivm, type); LLVMValueRef zeros = LLVMConstNull(int_vec_type); LLVMValueRef ones = LLVMConstAllOnes(int_vec_type); LLVMValueRef cond; diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c index 1f2af85..0ad78b0 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c @@ -1961,8 +1961,11 @@ dset_emit_cpu( struct lp_build_emit_data * emit_data, unsigned pipe_func) { + LLVMBuilderRef builder = bld_base->base.gallivm->builder; LLVMValueRef cond = lp_build_cmp(&bld_base->dbl_bld, pipe_func, emit_data->args[0], emit_data->args[1]); + /* arguments were 64 bit but store as 32 bit */ + cond = LLVMBuildTrunc(builder, cond, bld_base->int_bld.int_vec_type, ""); emit_data->output[emit_data->chan] = cond; } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/19] glsl: allow precision qualifiers for AoA
Reviewed-by: Ilia Mirkin On Sat, Jun 20, 2015 at 8:33 AM, Timothy Arceri wrote: > --- > src/glsl/ast_to_hir.cpp | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp > index f1c3e4a..0d3cbac 100644 > --- a/src/glsl/ast_to_hir.cpp > +++ b/src/glsl/ast_to_hir.cpp > @@ -3881,9 +3881,7 @@ ast_declarator_list::hir(exec_list *instructions, > * an array of that type. > */ >if (!(this->type->qualifier.precision == ast_precision_none > - || precision_qualifier_allowed(var->type) > - || (var->type->is_array() > - && precision_qualifier_allowed(var->type->fields.array { > + || precision_qualifier_allowed(var->type->without_array( { > > _mesa_glsl_error(&loc, state, >"precision qualifiers apply only to floating point" > -- > 2.1.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH RFC] egl/dri2: Add dri3 support to x11 platform
Hi Emil, On 07/03/2015 10:36 PM, Emil Velikov wrote: > Hi Boyan, > > Thank you for doing this ! A few suggestions which you might be interesting: > > Considering that the backend has handled more than dri2 perhaps we can > do a s/dri2/dri/ :-) That obviously is independent of your work. > > On 01/07/15 16:31, Boyan Ding wrote: >> Signed-off-by: Boyan Ding >> --- >> configure.ac |3 + >> src/egl/drivers/dri2/Makefile.am |5 + >> src/egl/drivers/dri2/egl_dri2.c | 65 +- >> src/egl/drivers/dri2/egl_dri2.h | 12 +- >> src/egl/drivers/dri2/platform_x11.c | 127 ++- >> src/egl/drivers/dri2/platform_x11_dri3.c | 1591 >> ++ >> src/egl/drivers/dri2/platform_x11_dri3.h | 140 +++ >> 7 files changed, 1926 insertions(+), 17 deletions(-) >> create mode 100644 src/egl/drivers/dri2/platform_x11_dri3.c >> create mode 100644 src/egl/drivers/dri2/platform_x11_dri3.h >> >> diff --git a/configure.ac b/configure.ac >> index af61aa2..090e6c9 100644 >> --- a/configure.ac >> +++ b/configure.ac >> @@ -1545,6 +1545,9 @@ if test "x$enable_egl" = xyes; then >> if test "x$enable_shared_glapi" = xno; then >> AC_MSG_ERROR([egl_dri2 requires --enable-shared-glapi]) >> fi >> +if test "x$enable_dri3" = xyes; then >> +EGL_LIB_DEPS="$EGL_LIB_DEPS -lxcb-dri3 -lxcb-present >> -lXxf86vm -lxshmfence -lxcb-sync" > Neither one of these should be a direct -lfoo expression. We should > check/use PKG_CHECK_MODULES and foo_{CFLAGS,LIBS}. I'll correct that. > >> +fi >> else >> # Avoid building an "empty" libEGL. Drop/update this >> # when other backends (haiku?) come along. >> diff --git a/src/egl/drivers/dri2/Makefile.am >> b/src/egl/drivers/dri2/Makefile.am >> index 55be4a7..d5b511c 100644 >> --- a/src/egl/drivers/dri2/Makefile.am >> +++ b/src/egl/drivers/dri2/Makefile.am >> @@ -52,6 +52,11 @@ if HAVE_EGL_PLATFORM_X11 >> libegl_dri2_la_SOURCES += platform_x11.c >> AM_CFLAGS += -DHAVE_X11_PLATFORM >> AM_CFLAGS += $(XCB_DRI2_CFLAGS) >> +if HAVE_DRI3 >> +libegl_dri2_la_SOURCES += \ >> +platform_x11_dri3.c \ >> +platform_x11_dri3.h >> +endif >> endif >> >> if HAVE_EGL_PLATFORM_WAYLAND >> diff --git a/src/egl/drivers/dri2/egl_dri2.c >> b/src/egl/drivers/dri2/egl_dri2.c >> index b1b65f7..052c579 100644 >> --- a/src/egl/drivers/dri2/egl_dri2.c >> +++ b/src/egl/drivers/dri2/egl_dri2.c >> @@ -322,6 +322,12 @@ struct dri2_extension_match { >> int offset; >> }; >> >> +static struct dri2_extension_match dri3_driver_extensions[] = { >> + { __DRI_CORE, 1, offsetof(struct dri2_egl_display, core) }, >> + { __DRI_IMAGE_DRIVER, 1, offsetof(struct dri2_egl_display, image_driver) >> }, >> + { NULL, 0, 0 } >> +}; >> + >> static struct dri2_extension_match dri2_driver_extensions[] = { >> { __DRI_CORE, 1, offsetof(struct dri2_egl_display, core) }, >> { __DRI_DRI2, 2, offsetof(struct dri2_egl_display, dri2) }, >> @@ -464,6 +470,24 @@ dri2_open_driver(_EGLDisplay *disp) >> } >> >> EGLBoolean >> +dri2_load_driver_dri3(_EGLDisplay *disp) > dri3_load_driver perhaps ? I prefixed all my functions in public files with "dri2" instead of "dri3" because the EGL driver is currently called "DRI2" (although it seems funny to see dri3 in a driver called "dri2"). The current name of the driver is "DRI2" because it uses dri2 to talk with the X server and uses similar technologies with direct rendering on other platforms. With my patch, the first reason becomes invalid. But should the name of the driver or namespace of all functions (or just the functions involved with dri3) change? I'm open to others' suggestions. The same applies to comments on function names below. > >> [snip] >> >> diff --git a/src/egl/drivers/dri2/egl_dri2.h >> b/src/egl/drivers/dri2/egl_dri2.h >> index f0cc6da..d258753 100644 >> --- a/src/egl/drivers/dri2/egl_dri2.h >> +++ b/src/egl/drivers/dri2/egl_dri2.h >> @@ -153,11 +153,16 @@ struct dri2_egl_display >> >> int dri2_major; >> int dri2_minor; >> + int dri3_major; >> + int dri3_minor; >> + int present_major; >> + int present_minor; > Many of these are unused. Same goes for the glx code which was the > inspiration for this work :-) I think it's safe to remove them then. > >> __DRIscreen *dri_screen; >> int own_dri_screen; >> const __DRIconfig **driver_configs; >> void *driver; >> const __DRIcoreExtension *core; >> + const __DRIimageDriverExtension *image_driver; >> const __DRIdri2Extension *dri2; >> const __DRIswrastExtension *swrast; >> const __DRI2flushExtension *flush; >> @@ -189,6 +194,7 @@ struct dri2_egl_display >> #ifdef HAVE_
[Mesa-dev] [Bug 90162] glGetFramebufferAttachmentParameteriv failing if certain FB attributes are zero
https://bugs.freedesktop.org/show_bug.cgi?id=90162 Tapani Pälli changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |NOTABUG --- Comment #10 from Tapani Pälli --- I'm resolving this as per comment #5 -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallivm: fix lp_build_compare_ext
On Fri, Jul 3, 2015 at 6:05 PM, wrote: > From: Roland Scheidegger > > The expansion should always be to the same width as the input arguments > no matter what, since these functions should work with any bit width of > the arguments (the sext is a no-op on any sane simd architecture). > Thus, fix the caller expecting differently. > > This fixes https://bugs.freedesktop.org/show_bug.cgi?id=91222 (not tested > otherwise) > --- > src/gallium/auxiliary/gallivm/lp_bld_logic.c | 2 +- > src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 3 +++ > 2 files changed, 4 insertions(+), 1 deletion(-) > > diff --git a/src/gallium/auxiliary/gallivm/lp_bld_logic.c > b/src/gallium/auxiliary/gallivm/lp_bld_logic.c > index f724cfa..80b53e5 100644 > --- a/src/gallium/auxiliary/gallivm/lp_bld_logic.c > +++ b/src/gallium/auxiliary/gallivm/lp_bld_logic.c > @@ -81,7 +81,7 @@ lp_build_compare_ext(struct gallivm_state *gallivm, > boolean ordered) > { > LLVMBuilderRef builder = gallivm->builder; > - LLVMTypeRef int_vec_type = lp_build_int_vec_type(gallivm, > lp_type_int_vec(32, 32 * type.length)); > + LLVMTypeRef int_vec_type = lp_build_int_vec_type(gallivm, type); > LLVMValueRef zeros = LLVMConstNull(int_vec_type); > LLVMValueRef ones = LLVMConstAllOnes(int_vec_type); > LLVMValueRef cond; > diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c > b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c > index 1f2af85..0ad78b0 100644 > --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c > +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c > @@ -1961,8 +1961,11 @@ dset_emit_cpu( > struct lp_build_emit_data * emit_data, > unsigned pipe_func) > { > + LLVMBuilderRef builder = bld_base->base.gallivm->builder; > LLVMValueRef cond = lp_build_cmp(&bld_base->dbl_bld, pipe_func, > emit_data->args[0], emit_data->args[1]); > + /* arguments were 64 bit but store as 32 bit */ > + cond = LLVMBuildTrunc(builder, cond, bld_base->int_bld.int_vec_type, ""); > emit_data->output[emit_data->chan] = cond; > } > > -- > 1.9.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev Tested-by: Vinson Lee ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: set stage flag for structs and arrays in resource list
This fixes: ES31-CTS.program_interface_query.uniform-types Cc: Tapani Pälli --- Note: This only fixes the remaining CTS subtests for uniform-types the other fixes are part of my clean-up series: http://lists.freedesktop.org/archives/mesa-dev/2015-July/088122.html src/glsl/linker.cpp | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index 0d4bc15..6a69c15 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -2648,9 +2648,19 @@ build_stageref(struct gl_shader_program *shProg, const char *name) */ foreach_in_list(ir_instruction, node, sh->ir) { ir_variable *var = node->as_variable(); - if (var && strcmp(var->name, name) == 0) { -stages |= (1 << i); -break; + if (var) { +unsigned baselen = strlen(var->name); +if (strncmp(var->name, name, baselen) == 0) { + /* Check for exact name matches but also check for arrays and +* structs. +*/ + if (name[baselen] == '\0' || + name[baselen] == '[' || + name[baselen] == '.') { + stages |= (1 << i); + break; + } +} } } } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 1/4] glsl: check for leading zeros in array index validation
I didn't notice earlier but it seems this series also fixes some CTS subtests for example some test are fixed in ES31-CTS.program_interface_query.uniform -types. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev