[Mesa-dev] [PATCH 2/7] draw: add stream member to stats callback
This just adds space for the member to the callback, doesn't change anything else. Signed-off-by: Dave Airlie --- src/gallium/auxiliary/draw/draw_pt_so_emit.c | 2 +- src/gallium/auxiliary/draw/draw_vbuf.h | 1 + src/gallium/drivers/llvmpipe/lp_setup_vbuf.c | 2 +- src/gallium/drivers/softpipe/sp_prim_vbuf.c | 2 +- 4 files changed, 4 insertions(+), 3 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_pt_so_emit.c b/src/gallium/auxiliary/draw/draw_pt_so_emit.c index 91e67c0..581e2d6 100644 --- a/src/gallium/auxiliary/draw/draw_pt_so_emit.c +++ b/src/gallium/auxiliary/draw/draw_pt_so_emit.c @@ -296,7 +296,7 @@ void draw_pt_so_emit( struct pt_so_emit *emit, } } - render->set_stream_output_info(render, + render->set_stream_output_info(render, 0, emit->emitted_primitives, emit->generated_primitives); } diff --git a/src/gallium/auxiliary/draw/draw_vbuf.h b/src/gallium/auxiliary/draw/draw_vbuf.h index 194796b..137e3e5 100644 --- a/src/gallium/auxiliary/draw/draw_vbuf.h +++ b/src/gallium/auxiliary/draw/draw_vbuf.h @@ -124,6 +124,7 @@ struct vbuf_render { * Called after writing data to the stream out buffers */ void (*set_stream_output_info)( struct vbuf_render *vbufr, + unsigned stream, unsigned primitive_count, unsigned primitive_generated ); diff --git a/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c b/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c index 8999200..d70c9be 100644 --- a/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c +++ b/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c @@ -544,7 +544,7 @@ lp_setup_vbuf_destroy(struct vbuf_render *vbr) * increase too should call this from outside streamout code. */ static void -lp_setup_so_info(struct vbuf_render *vbr, uint primitives, uint prim_generated) +lp_setup_so_info(struct vbuf_render *vbr, uint stream, uint primitives, uint prim_generated) { struct lp_setup_context *setup = lp_setup_context(vbr); struct llvmpipe_context *lp = llvmpipe_context(setup->pipe); diff --git a/src/gallium/drivers/softpipe/sp_prim_vbuf.c b/src/gallium/drivers/softpipe/sp_prim_vbuf.c index 18eca61..5809fd5 100644 --- a/src/gallium/drivers/softpipe/sp_prim_vbuf.c +++ b/src/gallium/drivers/softpipe/sp_prim_vbuf.c @@ -597,7 +597,7 @@ sp_vbuf_draw_arrays(struct vbuf_render *vbr, uint start, uint nr) * increase too should call this from outside streamout code. */ static void -sp_vbuf_so_info(struct vbuf_render *vbr, uint primitives, uint prim_generated) +sp_vbuf_so_info(struct vbuf_render *vbr, uint stream, uint primitives, uint prim_generated) { struct softpipe_vbuf_render *cvbr = softpipe_vbuf_render(vbr); struct softpipe_context *softpipe = cvbr->softpipe; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/7] draw/tgsi: implement geom shader invocation support.
From: Dave Airlie This is just for softpipe, llvmpipe won't work without some changes. Signed-off-by: Dave Airlie --- src/gallium/auxiliary/draw/draw_gs.c | 47 +- src/gallium/auxiliary/draw/draw_gs.h | 2 ++ src/gallium/auxiliary/tgsi/tgsi_scan.c | 2 ++ src/gallium/auxiliary/tgsi/tgsi_scan.h | 1 + 4 files changed, 34 insertions(+), 18 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c index 6375d41..755e527 100644 --- a/src/gallium/auxiliary/draw/draw_gs.c +++ b/src/gallium/auxiliary/draw/draw_gs.c @@ -190,9 +190,15 @@ static void tgsi_gs_prepare(struct draw_geometry_shader *shader, const unsigned constants_size[PIPE_MAX_CONSTANT_BUFFERS]) { struct tgsi_exec_machine *machine = shader->machine; - + int j; tgsi_exec_set_constant_buffers(machine, PIPE_MAX_CONSTANT_BUFFERS, constants, constants_size); + + if (shader->info.uses_invocationid) { + unsigned i = machine->SysSemanticToIndex[TGSI_SEMANTIC_INVOCATIONID]; + for (j = 0; j < TGSI_QUAD_SIZE; j++) + machine->SystemValue[i].i[j] = shader->invocation_id; + } } static unsigned tgsi_gs_run(struct draw_geometry_shader *shader, @@ -555,7 +561,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader *shader, * overflown vertices into some area where they won't harm anyone */ unsigned total_verts_per_buffer = shader->primitive_boundary * num_in_primitives; - + unsigned invocation; //Assume at least one primitive max_out_prims = MAX2(max_out_prims, 1); @@ -564,7 +570,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader *shader, output_verts->stride = output_verts->vertex_size; output_verts->verts = (struct vertex_header *)MALLOC(output_verts->vertex_size * - total_verts_per_buffer); + total_verts_per_buffer * shader->num_invocations); debug_assert(output_verts->verts); #if 0 @@ -592,7 +598,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader *shader, shader->input = input; shader->input_info = input_info; FREE(shader->primitive_lengths); - shader->primitive_lengths = MALLOC(max_out_prims * sizeof(unsigned)); + shader->primitive_lengths = MALLOC(max_out_prims * sizeof(unsigned) * shader->num_invocations); #ifdef HAVE_LLVM @@ -622,23 +628,26 @@ int draw_geometry_shader_run(struct draw_geometry_shader *shader, } #endif - shader->prepare(shader, constants, constants_size); + for (invocation = 0; invocation < shader->num_invocations; invocation++) { + shader->invocation_id = invocation; - if (input_prim->linear) - gs_run(shader, input_prim, input_verts, - output_prims, output_verts); - else - gs_run_elts(shader, input_prim, input_verts, - output_prims, output_verts); + shader->prepare(shader, constants, constants_size); - /* Flush the remaining primitives. Will happen if -* num_input_primitives % 4 != 0 -*/ - if (shader->fetched_prim_count > 0) { - gs_flush(shader); - } + if (input_prim->linear) + gs_run(shader, input_prim, input_verts, +output_prims, output_verts); + else + gs_run_elts(shader, input_prim, input_verts, + output_prims, output_verts); - debug_assert(shader->fetched_prim_count == 0); + /* Flush the remaining primitives. Will happen if + * num_input_primitives % 4 != 0 + */ + if (shader->fetched_prim_count > 0) { + gs_flush(shader); + } + debug_assert(shader->fetched_prim_count == 0); + } /* Update prim_info: */ @@ -771,6 +780,8 @@ draw_create_geometry_shader(struct draw_context *draw, gs->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM]; gs->max_output_vertices = gs->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES]; + gs->num_invocations = + gs->info.properties[TGSI_PROPERTY_GS_INVOCATIONS]; if (!gs->max_output_vertices) gs->max_output_vertices = 32; diff --git a/src/gallium/auxiliary/draw/draw_gs.h b/src/gallium/auxiliary/draw/draw_gs.h index 49e93d5..663ba84 100644 --- a/src/gallium/auxiliary/draw/draw_gs.h +++ b/src/gallium/auxiliary/draw/draw_gs.h @@ -90,6 +90,8 @@ struct draw_geometry_shader { unsigned vector_length; unsigned max_out_prims; + unsigned num_invocations; + unsigned invocation_id; #ifdef HAVE_LLVM struct draw_gs_inputs *gs_input; struct draw_gs_jit_context *jit_context; diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c b/src/gallium/auxiliary/tgsi/tgsi_scan.c index d821072..e0506f4 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c @@ -230,6 +230,8 @@ tgsi_scan_shader(const struct tgsi_token *tokens, } else if (sem
[Mesa-dev] [PATCH 4/7] softpipe: add support for indexed queries.
We need indexed queries to retrieve the geom shader info. Signed-off-by: Dave Airlie --- src/gallium/drivers/softpipe/sp_context.h | 2 +- src/gallium/drivers/softpipe/sp_prim_vbuf.c | 4 ++-- src/gallium/drivers/softpipe/sp_query.c | 23 --- src/gallium/include/pipe/p_state.h | 2 +- 4 files changed, 16 insertions(+), 15 deletions(-) diff --git a/src/gallium/drivers/softpipe/sp_context.h b/src/gallium/drivers/softpipe/sp_context.h index 50a7336..fa91854 100644 --- a/src/gallium/drivers/softpipe/sp_context.h +++ b/src/gallium/drivers/softpipe/sp_context.h @@ -91,7 +91,7 @@ struct softpipe_context { struct draw_so_target *so_targets[PIPE_MAX_SO_BUFFERS]; unsigned num_so_targets; - struct pipe_query_data_so_statistics so_stats; + struct pipe_query_data_so_statistics so_stats[PIPE_MAX_VERTEX_STREAMS]; struct pipe_query_data_pipeline_statistics pipeline_statistics; unsigned active_statistics_queries; diff --git a/src/gallium/drivers/softpipe/sp_prim_vbuf.c b/src/gallium/drivers/softpipe/sp_prim_vbuf.c index 5809fd5..6c16d9c 100644 --- a/src/gallium/drivers/softpipe/sp_prim_vbuf.c +++ b/src/gallium/drivers/softpipe/sp_prim_vbuf.c @@ -602,8 +602,8 @@ sp_vbuf_so_info(struct vbuf_render *vbr, uint stream, uint primitives, uint prim struct softpipe_vbuf_render *cvbr = softpipe_vbuf_render(vbr); struct softpipe_context *softpipe = cvbr->softpipe; - softpipe->so_stats.num_primitives_written += primitives; - softpipe->so_stats.primitives_storage_needed += prim_generated; + softpipe->so_stats[stream].num_primitives_written += primitives; + softpipe->so_stats[stream].primitives_storage_needed += prim_generated; } static void diff --git a/src/gallium/drivers/softpipe/sp_query.c b/src/gallium/drivers/softpipe/sp_query.c index e773870..0707219 100644 --- a/src/gallium/drivers/softpipe/sp_query.c +++ b/src/gallium/drivers/softpipe/sp_query.c @@ -39,6 +39,7 @@ struct softpipe_query { unsigned type; + unsigned index; uint64_t start; uint64_t end; struct pipe_query_data_so_statistics so; @@ -71,7 +72,7 @@ softpipe_create_query(struct pipe_context *pipe, type == PIPE_QUERY_TIMESTAMP_DISJOINT); sq = CALLOC_STRUCT( softpipe_query ); sq->type = type; - + sq->index = index; return (struct pipe_query *)sq; } @@ -98,17 +99,17 @@ softpipe_begin_query(struct pipe_context *pipe, struct pipe_query *q) sq->start = os_time_get_nano(); break; case PIPE_QUERY_SO_STATISTICS: - sq->so.num_primitives_written = softpipe->so_stats.num_primitives_written; - sq->so.primitives_storage_needed = softpipe->so_stats.primitives_storage_needed; + sq->so.num_primitives_written = softpipe->so_stats[0].num_primitives_written; + sq->so.primitives_storage_needed = softpipe->so_stats[0].primitives_storage_needed; break; case PIPE_QUERY_SO_OVERFLOW_PREDICATE: sq->end = FALSE; break; case PIPE_QUERY_PRIMITIVES_EMITTED: - sq->so.num_primitives_written = softpipe->so_stats.num_primitives_written; + sq->so.num_primitives_written = softpipe->so_stats[sq->index].num_primitives_written; break; case PIPE_QUERY_PRIMITIVES_GENERATED: - sq->so.primitives_storage_needed = softpipe->so_stats.primitives_storage_needed; + sq->so.primitives_storage_needed = softpipe->so_stats[sq->index].primitives_storage_needed; break; case PIPE_QUERY_TIMESTAMP: case PIPE_QUERY_GPU_FINISHED: @@ -154,24 +155,24 @@ softpipe_end_query(struct pipe_context *pipe, struct pipe_query *q) break; case PIPE_QUERY_SO_OVERFLOW_PREDICATE: sq->so.num_primitives_written = - softpipe->so_stats.num_primitives_written - sq->so.num_primitives_written; + softpipe->so_stats[0].num_primitives_written - sq->so.num_primitives_written; sq->so.primitives_storage_needed = - softpipe->so_stats.primitives_storage_needed - sq->so.primitives_storage_needed; + softpipe->so_stats[0].primitives_storage_needed - sq->so.primitives_storage_needed; sq->end = sq->so.primitives_storage_needed > sq->so.num_primitives_written; break; case PIPE_QUERY_SO_STATISTICS: sq->so.num_primitives_written = - softpipe->so_stats.num_primitives_written - sq->so.num_primitives_written; + softpipe->so_stats[sq->index].num_primitives_written - sq->so.num_primitives_written; sq->so.primitives_storage_needed = - softpipe->so_stats.primitives_storage_needed - sq->so.primitives_storage_needed; + softpipe->so_stats[sq->index].primitives_storage_needed - sq->so.primitives_storage_needed; break; case PIPE_QUERY_PRIMITIVES_EMITTED: sq->so.num_primitives_written = - softpipe->so_stats.num_primitives_written - sq->so.num_primitives_written; + softpipe->so_stats[sq->index].num_primitives_written - sq->so.num_primitives_written; bre
[Mesa-dev] [PATCH 6/7] softpipe: add support for vertex streams
This enables the ARB_gpu_shader5 vertex streams on softpipe. Signed-off-by: Dave Airlie --- src/gallium/drivers/softpipe/sp_screen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/softpipe/sp_screen.c b/src/gallium/drivers/softpipe/sp_screen.c index a688d31..76b3e95 100644 --- a/src/gallium/drivers/softpipe/sp_screen.c +++ b/src/gallium/drivers/softpipe/sp_screen.c @@ -122,7 +122,7 @@ softpipe_get_param(struct pipe_screen *screen, enum pipe_cap param) case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS: return 1024; case PIPE_CAP_MAX_VERTEX_STREAMS: - return 1; + return PIPE_MAX_VERTEX_STREAMS; case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE: return 2048; case PIPE_CAP_PRIMITIVE_RESTART: -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC] softpipe/draw: gpu_shader5 geometry shader features.
Hey, First posting of these, Mainly looking for feedback on the approach, or any pitfalls anyone might notice, they pass most of the tests, need to schedule piglit runs for llvmpipe to make sure I haven't broken anything. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 7/7] docs/GL3.txt: update softpipe arb_gpu_shader5 status.
Signed-off-by: Dave Airlie --- docs/GL3.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 8e1c8cd..9fbde36 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -99,13 +99,13 @@ GL 4.0, GLSL 4.00: GL_ARB_gpu_shader5 DONE (i965, nvc0) - 'precise' qualifierDONE - Dynamically uniform sampler array indices DONE (r600) - - Dynamically uniform UBO array indices DONE (r600) + - Dynamically uniform UBO array indices DONE (r600, softpipe) - Implicit signed -> unsigned conversionsDONE - Fused multiply-add DONE () - Packing/bitfield/conversion functions DONE (r600, radeonsi, softpipe) - Enhanced textureGather DONE (r600, radeonsi, softpipe) - - Geometry shader instancing DONE (r600) - - Geometry shader multiple streams DONE () + - Geometry shader instancing DONE (r600, softpipe) + - Geometry shader multiple streams DONE (softpipe) - Enhanced per-sample shadingDONE (r600, radeonsi) - Interpolation functionsDONE (r600) - New overload resolution rules DONE -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/7] draw: add support to tgsi paths for geometry streams.
This hooks up the geometry shader processing to the TGSI support added in the previous commits. It doesn't change the llvm interface other than to keep things building. Signed-off-by: Dave Airlie --- src/gallium/auxiliary/draw/draw_gs.c | 195 + src/gallium/auxiliary/draw/draw_gs.h | 21 ++- src/gallium/auxiliary/draw/draw_pt.h | 1 + .../auxiliary/draw/draw_pt_fetch_shade_pipeline.c | 16 +- .../draw/draw_pt_fetch_shade_pipeline_llvm.c | 14 +- src/gallium/auxiliary/draw/draw_pt_so_emit.c | 64 --- 6 files changed, 192 insertions(+), 119 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c index 755e527..9798518 100644 --- a/src/gallium/auxiliary/draw/draw_gs.c +++ b/src/gallium/auxiliary/draw/draw_gs.c @@ -75,6 +75,7 @@ draw_gs_should_flush(struct draw_geometry_shader *shader) /*#define DEBUG_OUTPUTS 1*/ static void tgsi_fetch_gs_outputs(struct draw_geometry_shader *shader, + unsigned stream, unsigned num_primitives, float (**p_output)[4]) { @@ -89,14 +90,16 @@ tgsi_fetch_gs_outputs(struct draw_geometry_shader *shader, */ for (prim_idx = 0; prim_idx < num_primitives; ++prim_idx) { - unsigned num_verts_per_prim = machine->Primitives[prim_idx]; - shader->primitive_lengths[prim_idx + shader->emitted_primitives] = - machine->Primitives[prim_idx]; - shader->emitted_vertices += num_verts_per_prim; + unsigned num_verts_per_prim = machine->Primitives[stream][prim_idx]; + + shader->stream[stream].primitive_lengths[prim_idx + shader->stream[stream].emitted_primitives] = + machine->Primitives[stream][prim_idx]; + shader->stream[stream].emitted_vertices += num_verts_per_prim; + for (j = 0; j < num_verts_per_prim; j++, current_idx++) { - int idx = current_idx * shader->info.num_outputs; + int idx = machine->PrimitiveOffsets[stream][prim_idx] + current_idx * shader->info.num_outputs; #ifdef DEBUG_OUTPUTS - debug_printf("%d) Output vert:\n", idx / shader->info.num_outputs); + debug_printf("%d/%d) Output vert:\n", stream, idx / shader->info.num_outputs); #endif for (slot = 0; slot < shader->info.num_outputs; slot++) { output[slot][0] = machine->Outputs[idx + slot].xyzw[0].f[0]; @@ -115,7 +118,7 @@ tgsi_fetch_gs_outputs(struct draw_geometry_shader *shader, } } *p_output = output; - shader->emitted_primitives += num_primitives; + shader->stream[stream].emitted_primitives += num_primitives; } /*#define DEBUG_INPUTS 1*/ @@ -201,11 +204,12 @@ static void tgsi_gs_prepare(struct draw_geometry_shader *shader, } } -static unsigned tgsi_gs_run(struct draw_geometry_shader *shader, -unsigned input_primitives) +static void tgsi_gs_run(struct draw_geometry_shader *shader, +unsigned input_primitives, +unsigned *out_prims) { struct tgsi_exec_machine *machine = shader->machine; - + int i; tgsi_set_exec_mask(machine, 1, input_primitives > 1, @@ -215,8 +219,30 @@ static unsigned tgsi_gs_run(struct draw_geometry_shader *shader, /* run interpreter */ tgsi_exec_machine_run(machine); - return - machine->Temps[TGSI_EXEC_TEMP_PRIMITIVE_I].xyzw[TGSI_EXEC_TEMP_PRIMITIVE_C].u[0]; + for (i = 0; i < 4; i++) { + int prim_i; + int prim_c; + switch (i) { + case 0: + prim_i = TGSI_EXEC_TEMP_PRIMITIVE_I; + prim_c = TGSI_EXEC_TEMP_PRIMITIVE_C; + break; + case 1: + prim_i = TGSI_EXEC_TEMP_PRIMITIVE_S1_I; + prim_c = TGSI_EXEC_TEMP_PRIMITIVE_S1_C; + break; + case 2: + prim_i = TGSI_EXEC_TEMP_PRIMITIVE_S2_I; + prim_c = TGSI_EXEC_TEMP_PRIMITIVE_S2_C; + break; + case 3: + prim_i = TGSI_EXEC_TEMP_PRIMITIVE_S3_I; + prim_c = TGSI_EXEC_TEMP_PRIMITIVE_S3_C; + break; + }; + + out_prims[i] = machine->Temps[prim_i].xyzw[prim_c].u[0]; + } } #ifdef HAVE_LLVM @@ -293,6 +319,7 @@ llvm_fetch_gs_input(struct draw_geometry_shader *shader, static void llvm_fetch_gs_outputs(struct draw_geometry_shader *shader, + unsigned stream, unsigned num_primitives, float (**p_output)[4]) { @@ -313,7 +340,7 @@ llvm_fetch_gs_outputs(struct draw_geometry_shader *shader, total_verts += shader->llvm_emitted_vertices[i]; } - output_ptr += shader->emitted_vertices * shader->vertex_size; + output_ptr += shader->stream[0].emitted_vertices * shader->vertex_size; for (i = 0; i < shader->vector_length - 1; ++i) { int current_verts = shader->llvm_emitted_vertices[i]; int next_verts = shader->llvm_emitted_vertices[i + 1]; @@ -360,14 +38
[Mesa-dev] [PATCH 3/7] tgsi: add support for geometry shader streams.
This adds support to retrieve the primitive counts for each stream, along with the offset for each primitive into the output array. It also adds support for parsing the stream argument to the emit and end instructions. Signed-off-by: Dave Airlie --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 59 ++ src/gallium/auxiliary/tgsi/tgsi_exec.h | 14 ++-- 2 files changed, 58 insertions(+), 15 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index a098a82..f080385 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -706,7 +706,22 @@ enum tgsi_exec_datatype { #define TEMP_OUTPUT_C TGSI_EXEC_TEMP_OUTPUT_C #define TEMP_PRIMITIVE_I TGSI_EXEC_TEMP_PRIMITIVE_I #define TEMP_PRIMITIVE_C TGSI_EXEC_TEMP_PRIMITIVE_C - +#define TEMP_PRIMITIVE_S1_I TGSI_EXEC_TEMP_PRIMITIVE_S1_I +#define TEMP_PRIMITIVE_S1_C TGSI_EXEC_TEMP_PRIMITIVE_S1_C +#define TEMP_PRIMITIVE_S2_I TGSI_EXEC_TEMP_PRIMITIVE_S2_I +#define TEMP_PRIMITIVE_S2_C TGSI_EXEC_TEMP_PRIMITIVE_S2_C +#define TEMP_PRIMITIVE_S3_I TGSI_EXEC_TEMP_PRIMITIVE_S3_I +#define TEMP_PRIMITIVE_S3_C TGSI_EXEC_TEMP_PRIMITIVE_S3_C + +static const struct { + int idx; + int chan; +} temp_prim_idxs[] = { + { TEMP_PRIMITIVE_I, TEMP_PRIMITIVE_C }, + { TEMP_PRIMITIVE_S1_I, TEMP_PRIMITIVE_S1_C }, + { TEMP_PRIMITIVE_S2_I, TEMP_PRIMITIVE_S2_C }, + { TEMP_PRIMITIVE_S3_I, TEMP_PRIMITIVE_S3_C }, +}; /** The execution mask depends on the conditional mask and the loop mask */ #define UPDATE_EXEC_MASK(MACH) \ @@ -1848,35 +1863,51 @@ exec_kill(struct tgsi_exec_machine *mach, } static void -emit_vertex(struct tgsi_exec_machine *mach) +emit_vertex(struct tgsi_exec_machine *mach, +const struct tgsi_full_instruction *inst) { + union tgsi_exec_channel r[1]; + unsigned stream_id; + unsigned *prim_count; /* FIXME: check for exec mask correctly unsigned i; for (i = 0; i < TGSI_QUAD_SIZE; ++i) { if ((mach->ExecMask & (1 << i))) */ + IFETCH(&r[0], 0, TGSI_CHAN_X); + stream_id = r[0].u[0]; + prim_count = &mach->Temps[temp_prim_idxs[stream_id].idx].xyzw[temp_prim_idxs[stream_id].chan].u[0]; if (mach->ExecMask) { - if (mach->Primitives[mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C].u[0]] >= mach->MaxOutputVertices) + if (mach->Primitives[stream_id][*prim_count] >= mach->MaxOutputVertices) return; + mach->PrimitiveOffsets[stream_id][*prim_count] = mach->Temps[TEMP_OUTPUT_I].xyzw[TEMP_OUTPUT_C].u[0]; mach->Temps[TEMP_OUTPUT_I].xyzw[TEMP_OUTPUT_C].u[0] += mach->NumOutputs; - mach->Primitives[mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C].u[0]]++; + mach->Primitives[stream_id][*prim_count]++; } } static void -emit_primitive(struct tgsi_exec_machine *mach) +emit_primitive(struct tgsi_exec_machine *mach, + const struct tgsi_full_instruction *inst) { - unsigned *prim_count = &mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C].u[0]; + unsigned *prim_count; + union tgsi_exec_channel r[1]; + unsigned stream_id = 0; /* FIXME: check for exec mask correctly unsigned i; for (i = 0; i < TGSI_QUAD_SIZE; ++i) { if ((mach->ExecMask & (1 << i))) */ + if (inst) { + IFETCH(&r[0], 0, TGSI_CHAN_X); + stream_id = r[0].u[0]; + } + prim_count = &mach->Temps[temp_prim_idxs[stream_id].idx].xyzw[temp_prim_idxs[stream_id].chan].u[0]; if (mach->ExecMask) { ++(*prim_count); debug_assert((*prim_count * mach->NumOutputs) < mach->MaxGeometryShaderOutputs); - mach->Primitives[*prim_count] = 0; + mach->Primitives[stream_id][*prim_count] = 0; } } @@ -1885,9 +1916,9 @@ conditional_emit_primitive(struct tgsi_exec_machine *mach) { if (TGSI_PROCESSOR_GEOMETRY == mach->Processor) { int emitted_verts = - mach->Primitives[mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C].u[0]]; + mach->Primitives[0][mach->Temps[temp_prim_idxs[0].idx].xyzw[temp_prim_idxs[0].chan].u[0]]; if (emitted_verts) { - emit_primitive(mach); + emit_primitive(mach, NULL); } } } @@ -4596,11 +4627,11 @@ exec_instruction( break; case TGSI_OPCODE_EMIT: - emit_vertex(mach); + emit_vertex(mach, inst); break; case TGSI_OPCODE_ENDPRIM: - emit_primitive(mach); + emit_primitive(mach, inst); break; case TGSI_OPCODE_BGNLOOP: @@ -5061,8 +5092,10 @@ tgsi_exec_machine_run( struct tgsi_exec_machine *mach ) mach->Temps[TEMP_OUTPUT_I].xyzw[TEMP_OUTPUT_C].u[0] = 0; if( mach->Processor == TGSI_PROCESSOR_GEOMETRY ) { - mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C].u[0] = 0; - mach->Primitives[0] = 0; + for (i = 0; i < TGSI_MAX_VERTEX_STREAMS; i++) { + mach->Temps[temp_prim_idxs[i].idx].xyzw[temp_prim_idxs[i].chan].u[0] = 0; + mach-
[Mesa-dev] [PATCH] nir: Fix output swizzle in get_mul_for_src
When we compute the output swizzle we want to consider the writemask of the add operation, not the one from the multiplication. --- src/glsl/nir/nir_opt_peephole_ffma.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/src/glsl/nir/nir_opt_peephole_ffma.c b/src/glsl/nir/nir_opt_peephole_ffma.c index b430eac..c895c22 100644 --- a/src/glsl/nir/nir_opt_peephole_ffma.c +++ b/src/glsl/nir/nir_opt_peephole_ffma.c @@ -73,7 +73,8 @@ are_all_uses_fadd(nir_ssa_def *def) } static nir_alu_instr * -get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], bool *negate, bool *abs) +get_mul_for_src(nir_alu_src *src, int writemask, +uint8_t swizzle[4], bool *negate, bool *abs) { assert(src->src.is_ssa && !src->abs && !src->negate); @@ -85,16 +86,16 @@ get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], bool *negate, bool *abs) switch (alu->op) { case nir_op_imov: case nir_op_fmov: - alu = get_mul_for_src(&alu->src[0], swizzle, negate, abs); + alu = get_mul_for_src(&alu->src[0], writemask, swizzle, negate, abs); break; case nir_op_fneg: - alu = get_mul_for_src(&alu->src[0], swizzle, negate, abs); + alu = get_mul_for_src(&alu->src[0], writemask, swizzle, negate, abs); *negate = !*negate; break; case nir_op_fabs: - alu = get_mul_for_src(&alu->src[0], swizzle, negate, abs); + alu = get_mul_for_src(&alu->src[0], writemask, swizzle, negate, abs); *negate = false; *abs = true; break; @@ -116,7 +117,7 @@ get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], bool *negate, bool *abs) return NULL; for (unsigned i = 0; i < 4; i++) { - if (!(alu->dest.write_mask & (1 << i))) + if (!(writemask & (1 << i))) break; swizzle[i] = swizzle[src->swizzle[i]]; @@ -160,7 +161,8 @@ nir_opt_peephole_ffma_block(nir_block *block, void *void_state) negate = false; abs = false; - mul = get_mul_for_src(&add->src[add_mul_src], swizzle, &negate, &abs); + mul = get_mul_for_src(&add->src[add_mul_src], add->dest.write_mask, + swizzle, &negate, &abs); if (mul != NULL) break; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] tgsi: handle indirect sampler arrays.
This is required for ARB_gpu_shader5 support in softpipe. Signed-off-by: Dave Airlie --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 23 ++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index f080385..a2f52f0 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -2032,14 +2032,35 @@ exec_tex(struct tgsi_exec_machine *mach, const struct tgsi_full_instruction *inst, uint modifier, uint sampler) { - const uint unit = inst->Src[sampler].Register.Index; const union tgsi_exec_channel *args[5], *proj = NULL; union tgsi_exec_channel r[5]; enum tgsi_sampler_control control = tgsi_sampler_lod_none; uint chan; + uint unit; int8_t offsets[3]; int dim, shadow_ref, i; + + if (inst->Src[sampler].Register.Indirect) { + const struct tgsi_full_src_register *reg = &inst->Src[sampler]; + union tgsi_exec_channel indir_index, index2; + + index2.i[0] = + index2.i[1] = + index2.i[2] = + index2.i[3] = reg->Indirect.Index; + + fetch_src_file_channel(mach, + 0, + reg->Indirect.File, + reg->Indirect.Swizzle, + &index2, + &ZeroVec, + &indir_index); + unit = inst->Src[sampler].Register.Index + indir_index.i[0]; + } else { + unit = inst->Src[sampler].Register.Index; + } /* always fetch all 3 offsets, overkill but keeps code simple */ fetch_texel_offsets(mach, inst, offsets); -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] radeon/llvm: don't use a static array size for radeon_llvm_context::arrays (v2)
On 27.05.2015 07:34, Marek Olšák wrote: > From: Marek Olšák > > v2: - don't use realloc (tgsi_shader_info provides the size) The series is Reviewed-by: Michel Dänzer -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [v4 PATCH 00/10] Implement extension ARB_framebuffer_no_attachments
This patch series implements: - the needed functionality in Mesa-core for ARB_framebuffer_no_attachments - implements and enables the extension i965 Kevin Rogovin (10): mesa: Define infrastructure for ARB_framebuffer_no_attachments mesa: Constants and functions for ARB_framebuffer_no_attachments mesa: Complete ARB_framebuffer_no_attachments in Mesa core mesa: add helper functions for geometry of gl_framebuffer mesa: helper function for scissor box of gl_framebuffer i965: Use _mesa_geometry_ functions appropriately mesa: mesa: function for testing if current frag-shader has atomics i965: execution of frag-shader when it has atomic buffer i965: enable ARB_framebuffer_no_attachments for Gen7+ mark GL_ARB_framebuffer_no_attachments as done for i965 docs/GL3.txt | 4 +- docs/relnotes/10.7.0.html | 4 +- .../glapi/gen/ARB_framebuffer_no_attachments.xml | 32 +++ src/mapi/glapi/gen/Makefile.am | 1 + src/mapi/glapi/gen/gl_API.xml | 4 +- src/mesa/drivers/dri/i965/brw_clip_state.c | 9 +- src/mesa/drivers/dri/i965/brw_context.c| 6 + src/mesa/drivers/dri/i965/brw_misc_state.c | 9 +- src/mesa/drivers/dri/i965/brw_sf_state.c | 6 + src/mesa/drivers/dri/i965/brw_state_upload.c | 6 +- src/mesa/drivers/dri/i965/brw_wm.c | 7 +- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 +- src/mesa/drivers/dri/i965/gen6_clip_state.c| 10 +- src/mesa/drivers/dri/i965/gen6_multisample_state.c | 3 +- src/mesa/drivers/dri/i965/gen6_scissor_state.c | 12 +- src/mesa/drivers/dri/i965/gen6_sf_state.c | 3 +- src/mesa/drivers/dri/i965/gen6_viewport_state.c| 5 +- src/mesa/drivers/dri/i965/gen6_wm_state.c | 3 +- src/mesa/drivers/dri/i965/gen7_sf_state.c | 3 +- src/mesa/drivers/dri/i965/gen7_viewport_state.c| 5 +- src/mesa/drivers/dri/i965/gen7_wm_state.c | 7 +- src/mesa/drivers/dri/i965/gen8_ps_state.c | 3 + src/mesa/drivers/dri/i965/gen8_viewport_state.c| 8 +- src/mesa/drivers/dri/i965/intel_extensions.c | 1 + src/mesa/main/extensions.c | 1 + src/mesa/main/fbobject.c | 221 +++-- src/mesa/main/fbobject.h | 6 + src/mesa/main/framebuffer.c| 64 -- src/mesa/main/framebuffer.h| 33 +++ src/mesa/main/get.c| 1 + src/mesa/main/get_hash_params.py | 6 + src/mesa/main/mtypes.h | 61 +- src/mesa/main/tests/dispatch_sanity.cpp| 4 +- 33 files changed, 475 insertions(+), 85 deletions(-) create mode 100644 src/mapi/glapi/gen/ARB_framebuffer_no_attachments.xml -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [v4 PATCH 02/10] mesa: Constants and functions for ARB_framebuffer_no_attachments
Define the enumeration constants, function entry points and glGet for the GL_ARB_framebuffer_no_attachments. v2: Add output="true" for GetFramebufferParameteriv parameter params. Alphabetical insertion. v3: Implement _mesa_GetFramebufferParameteriv and _mesa_FramebufferParameteri as always error. v4: Formatting fixes. Remove added documentation of how to add enums for glGet Signed-off-by: Kevin Rogovin --- .../glapi/gen/ARB_framebuffer_no_attachments.xml | 32 ++ src/mapi/glapi/gen/Makefile.am | 1 + src/mapi/glapi/gen/gl_API.xml | 4 ++- src/mesa/main/fbobject.c | 28 +++ src/mesa/main/fbobject.h | 6 src/mesa/main/get.c| 1 + src/mesa/main/get_hash_params.py | 6 src/mesa/main/tests/dispatch_sanity.cpp| 4 +-- 8 files changed, 79 insertions(+), 3 deletions(-) create mode 100644 src/mapi/glapi/gen/ARB_framebuffer_no_attachments.xml diff --git a/src/mapi/glapi/gen/ARB_framebuffer_no_attachments.xml b/src/mapi/glapi/gen/ARB_framebuffer_no_attachments.xml new file mode 100644 index 000..10bdebc --- /dev/null +++ b/src/mapi/glapi/gen/ARB_framebuffer_no_attachments.xml @@ -0,0 +1,32 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am index adebd5c..5099f12 100644 --- a/src/mapi/glapi/gen/Makefile.am +++ b/src/mapi/glapi/gen/Makefile.am @@ -130,6 +130,7 @@ API_XML = \ ARB_draw_instanced.xml \ ARB_ES2_compatibility.xml \ ARB_ES3_compatibility.xml \ + ARB_framebuffer_no_attachments.xml \ ARB_framebuffer_object.xml \ ARB_geometry_shader4.xml \ ARB_get_program_binary.xml \ diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 3090b9f..5079d30 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -8188,7 +8188,9 @@ - +http://www.w3.org/2001/XInclude"/> + + diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index 8fea7f8..4ac3f20 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -1335,6 +1335,34 @@ _mesa_BindRenderbufferEXT(GLenum target, GLuint renderbuffer) bind_renderbuffer(target, renderbuffer, true); } +extern void GLAPIENTRY +_mesa_FramebufferParameteri(GLenum target, GLenum pname, GLint param) +{ + GET_CURRENT_CONTEXT(ctx); + + (void) target; + (void) pname; + (void) param; + + _mesa_error(ctx, GL_INVALID_OPERATION, + "glFramebufferParameteri not supported " + "(ARB_framebuffer_no_attachments not implemented)"); +} + +extern void GLAPIENTRY +_mesa_GetFramebufferParameteriv(GLenum target, GLenum pname, GLint *params) +{ + GET_CURRENT_CONTEXT(ctx); + + (void) target; + (void) pname; + (void) param; + + _mesa_error(ctx, GL_INVALID_OPERATION, + "glGetNamedFramebufferParameteriv not supported " + "(ARB_framebuffer_no_attachments not implemented)"); +} + /** * Remove the specified renderbuffer or texture from any attachment point in diff --git a/src/mesa/main/fbobject.h b/src/mesa/main/fbobject.h index 9f570db..8dad0ff 100644 --- a/src/mesa/main/fbobject.h +++ b/src/mesa/main/fbobject.h @@ -288,4 +288,10 @@ extern void GLAPIENTRY _mesa_DiscardFramebufferEXT(GLenum target, GLsizei numAttachments, const GLenum *attachments); +extern void GLAPIENTRY +_mesa_FramebufferParameteri(GLenum target, GLenum pname, GLint param); + +extern void GLAPIENTRY +_mesa_GetFramebufferParameteriv(GLenum target, GLenum pname, GLint *params); + #endif /* FBOBJECT_H */ diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 8a6c81a..6290096 100644 --- a/src/mesa/main/get.c +++ b/src/mesa/main/get.c @@ -393,6 +393,7 @@ EXTRA_EXT(INTEL_performance_query); EXTRA_EXT(ARB_explicit_uniform_location); EXTRA_EXT(ARB_clip_control); EXTRA_EXT(EXT_polygon_offset_clamp); +EXTRA_EXT(ARB_framebuffer_no_attachments); static const int extra_ARB_color_buffer_float_or_glcore[] = { diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py index 41cb2c1..4d30cee 100644 --- a/src/mesa/main/get_hash_params.py +++ b/src/mesa/main/get_hash_params.py @@ -798,6 +798,12 @@ descriptor=[ [ "MIN_FRAGMENT_INTERPOLATION_OFFSET", "CONTEXT_FLOAT(Const.MinFragmentInterpolationOffset), extra_ARB_gpu_shader5" ], [ "MAX_FRAGMENT_INTERPOLATION_OFFSET", "CONTEXT_FLOAT(Const.MaxFragmentInterpolationOffset), extra_ARB_gpu_shader5" ], [ "FRAGMENT_INTERPOLATION_OFFSET_BITS", "CONST(FRAGMENT_INTERPOLATION_OFFSET_BITS), extra_ARB_gpu_shader5" ], + +# GL_ARB_framebuffer_no_attachments + ["MAX_FRAMEBUFFER_WIDTH", "CONTEXT_INT(Const.MaxFramebufferWidth), extra_ARB_framebuffer_no_attachments"], + ["MAX_FRAMEBUFFER_HEIGH
[Mesa-dev] [v4 PATCH 05/10] mesa: helper function for scissor box of gl_framebuffer
Add helper convenience function that intersects the scissor values against a passed bounding box. In addition, to avoid replicated code, make the function _mesa_scissor_bounding_box() use this new function. v2: Split from patch "mesa:helper-conveniance functions for drivers to implement ARB_framebuffer_no_attachment". White space and long line fixes. v3: No changes. v4: No changes. Signed-off-by: Kevin Rogovin --- src/mesa/main/framebuffer.c | 63 +++-- src/mesa/main/framebuffer.h | 5 2 files changed, 48 insertions(+), 20 deletions(-) diff --git a/src/mesa/main/framebuffer.c b/src/mesa/main/framebuffer.c index c2cfb92..dd9e4bc 100644 --- a/src/mesa/main/framebuffer.c +++ b/src/mesa/main/framebuffer.c @@ -357,42 +357,38 @@ update_framebuffer_size(struct gl_context *ctx, struct gl_framebuffer *fb) } + /** - * Calculate the inclusive bounding box for the scissor of a specific viewport + * Given a bounding box, intersect the bounding box with the scissor of + * a specified vieport. * * \param ctx GL context. - * \param buffer Framebuffer to be checked against * \param idx Index of the desired viewport * \param bboxBounding box for the scissored viewport. Stored as xmin, *xmax, ymin, ymax. - * - * \warning This function assumes that the framebuffer dimensions are up to - * date (e.g., update_framebuffer_size has been recently called on \c buffer). - * - * \sa _mesa_clip_to_region */ -void -_mesa_scissor_bounding_box(const struct gl_context *ctx, - const struct gl_framebuffer *buffer, - unsigned idx, int *bbox) +extern void +_mesa_intersect_scissor_bounding_box(const struct gl_context *ctx, + unsigned idx, int *bbox) { - bbox[0] = 0; - bbox[2] = 0; - bbox[1] = buffer->Width; - bbox[3] = buffer->Height; - if (ctx->Scissor.EnableFlags & (1u << idx)) { + int xmax, ymax; + + xmax = ctx->Scissor.ScissorArray[idx].X ++ ctx->Scissor.ScissorArray[idx].Width; + ymax = ctx->Scissor.ScissorArray[idx].Y ++ ctx->Scissor.ScissorArray[idx].Height; if (ctx->Scissor.ScissorArray[idx].X > bbox[0]) { bbox[0] = ctx->Scissor.ScissorArray[idx].X; } if (ctx->Scissor.ScissorArray[idx].Y > bbox[2]) { bbox[2] = ctx->Scissor.ScissorArray[idx].Y; } - if (ctx->Scissor.ScissorArray[idx].X + ctx->Scissor.ScissorArray[idx].Width < bbox[1]) { - bbox[1] = ctx->Scissor.ScissorArray[idx].X + ctx->Scissor.ScissorArray[idx].Width; + if (xmax < bbox[1]) { + bbox[1] = xmax; } - if (ctx->Scissor.ScissorArray[idx].Y + ctx->Scissor.ScissorArray[idx].Height < bbox[3]) { - bbox[3] = ctx->Scissor.ScissorArray[idx].Y + ctx->Scissor.ScissorArray[idx].Height; + if (ymax < bbox[3]) { +bbox[3] = ymax; } /* finally, check for empty region */ if (bbox[0] > bbox[1]) { @@ -402,6 +398,33 @@ _mesa_scissor_bounding_box(const struct gl_context *ctx, bbox[2] = bbox[3]; } } +} + +/** + * Calculate the inclusive bounding box for the scissor of a specific viewport + * + * \param ctx GL context. + * \param buffer Framebuffer to be checked against + * \param idx Index of the desired viewport + * \param bboxBounding box for the scissored viewport. Stored as xmin, + *xmax, ymin, ymax. + * + * \warning This function assumes that the framebuffer dimensions are up to + * date (e.g., update_framebuffer_size has been recently called on \c buffer). + * + * \sa _mesa_clip_to_region + */ +void +_mesa_scissor_bounding_box(const struct gl_context *ctx, + const struct gl_framebuffer *buffer, + unsigned idx, int *bbox) +{ + bbox[0] = 0; + bbox[2] = 0; + bbox[1] = buffer->Width; + bbox[3] = buffer->Height; + + _mesa_intersect_scissor_bounding_box(ctx, idx, bbox); assert(bbox[0] <= bbox[1]); assert(bbox[2] <= bbox[3]); diff --git a/src/mesa/main/framebuffer.h b/src/mesa/main/framebuffer.h index 8b2aa34..bb1f2ea 100644 --- a/src/mesa/main/framebuffer.h +++ b/src/mesa/main/framebuffer.h @@ -76,6 +76,10 @@ _mesa_scissor_bounding_box(const struct gl_context *ctx, const struct gl_framebuffer *buffer, unsigned idx, int *bbox); +extern void +_mesa_intersect_scissor_bounding_box(const struct gl_context *ctx, + unsigned idx, int *bbox); + static inline GLuint _mesa_geometric_width(const struct gl_framebuffer *buffer) { @@ -83,6 +87,7 @@ _mesa_geometric_width(const struct gl_framebuffer *buffer) buffer->Width : buffer->DefaultGeometry.Width; } + static inline GLuint _mesa_geometric_height(const struct gl_framebuffer *buffer) { -- 1.9.1 ___ mesa-dev mailing lis
[Mesa-dev] [v4 PATCH 04/10] mesa: add helper functions for geometry of gl_framebuffer
Add convenience helper functions for fetching geometry of gl_framebuffer that return the geometry of the gl_framebuffer instead of the geometry of the buffers of the gl_framebuffer when then the gl_framebuffer has no attachments. v2: Split from patch "mesa:helper-conveniance functions for drivers to implement ARB_framebuffer_no_attachment". v3: Add error check for functions of extension. Implement DSA functions dependent on extension. v4: Formatting fixes. Signed-off-by: Kevin Rogovin --- src/mesa/main/framebuffer.h | 28 src/mesa/main/mtypes.h | 8 +++- 2 files changed, 35 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/framebuffer.h b/src/mesa/main/framebuffer.h index d02b86f..8b2aa34 100644 --- a/src/mesa/main/framebuffer.h +++ b/src/mesa/main/framebuffer.h @@ -76,6 +76,34 @@ _mesa_scissor_bounding_box(const struct gl_context *ctx, const struct gl_framebuffer *buffer, unsigned idx, int *bbox); +static inline GLuint +_mesa_geometric_width(const struct gl_framebuffer *buffer) +{ + return buffer->_HasAttachments ? + buffer->Width : buffer->DefaultGeometry.Width; +} + +static inline GLuint +_mesa_geometric_height(const struct gl_framebuffer *buffer) +{ + return buffer->_HasAttachments ? + buffer->Height : buffer->DefaultGeometry.Height; +} + +static inline GLuint +_mesa_geometric_samples(const struct gl_framebuffer *buffer) +{ + return buffer->_HasAttachments ? + buffer->Visual.samples : buffer->DefaultGeometry.NumSamples; +} + +static inline GLuint +_mesa_geometric_layers(const struct gl_framebuffer *buffer) +{ + return buffer->_HasAttachments ? + buffer->MaxNumLayers : buffer->DefaultGeometry.Layers; +} + extern void _mesa_update_draw_buffer_bounds(struct gl_context *ctx, struct gl_framebuffer *drawFb); diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 5abbc0a..08316dc 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -3187,7 +3187,13 @@ struct gl_framebuffer * GL_ARB_framebuffer_no_attachments must check for the flag _HasAttachments * and if GL_FALSE, must then use the values in DefaultGeometry to initialize * its viewport, scissor and so on (in particular _Xmin, _Xmax, _Ymin and -* _Ymax do NOT take into account _HasAttachments being false) +* _Ymax do NOT take into account _HasAttachments being false). To get the +* geometry of the framebuffer, the helper functions +* _mesa_geometric_width(), +* _mesa_geometric_height(), +* _mesa_geometric_samples() and +* _mesa_geometric_layers() +* are available that check _HasAttachments. */ GLboolean _HasAttachments; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [v4 PATCH 08/10] i965: execution of frag-shader when it has atomic buffer
Ensure that the GPU spawns the fragment shader thread for those fragment shaders with atomic buffer access. v2: No change. v3: Use utility function _mesa_active_fragment_shader_has_atomic_ops(). v4: Formatting fixes. Reviewed-by: Tapani Pälli (v1) Signed-off-by: Kevin Rogovin --- src/mesa/drivers/dri/i965/gen7_wm_state.c | 4 src/mesa/drivers/dri/i965/gen8_ps_state.c | 3 +++ 2 files changed, 7 insertions(+) diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c b/src/mesa/drivers/dri/i965/gen7_wm_state.c index 1c47076..ea11ae8 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c @@ -77,6 +77,10 @@ upload_wm_state(struct brw_context *brw) dw1 |= GEN7_WM_KILL_ENABLE; } + if (_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx)) { + dw1 |= GEN7_WM_DISPATCH_ENABLE; + } + /* _NEW_BUFFERS | _NEW_COLOR */ if (brw_color_buffer_write_enabled(brw) || writes_depth || dw1 & GEN7_WM_KILL_ENABLE) { diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c b/src/mesa/drivers/dri/i965/gen8_ps_state.c index 85ad3b6..b9cae50 100644 --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c @@ -58,6 +58,9 @@ gen8_upload_ps_extra(struct brw_context *brw, if (prog_data->uses_omask) dw1 |= GEN8_PSX_OMASK_TO_RENDER_TARGET; + if (_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx)) + dw1 |= GEN8_PSX_SHADER_HAS_UAV; + BEGIN_BATCH(2); OUT_BATCH(_3DSTATE_PS_EXTRA << 16 | (2 - 2)); OUT_BATCH(dw1); -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [v4 PATCH 09/10] i965: enable ARB_framebuffer_no_attachments for Gen7+
Enable GL_ARB_framebuffer_no_attachments in i965 for Gen7 and higher. v2: No changes. v3: intel_extensions.c: Alphabetize insertion. v4: No changes. Reviewed-by: Ian Romanick (v2) Signed-off-by: Kevin Rogovin --- src/mesa/drivers/dri/i965/brw_context.c | 6 ++ src/mesa/drivers/dri/i965/intel_extensions.c | 1 + 2 files changed, 7 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index ea56859..9401a4a 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -613,6 +613,12 @@ brw_initialize_context_constants(struct brw_context *brw) /* ARB_gpu_shader5 */ if (brw->gen >= 7) ctx->Const.MaxVertexStreams = MIN2(4, MAX_VERTEX_STREAMS); + + /* ARB_framebuffer_no_attachments */ + ctx->Const.MaxFramebufferWidth = ctx->Const.MaxViewportWidth; + ctx->Const.MaxFramebufferHeight = ctx->Const.MaxViewportHeight; + ctx->Const.MaxFramebufferLayers = ctx->Const.MaxArrayTextureLayers; + ctx->Const.MaxFramebufferSamples = max_samples; } static void diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index 18b69a0..1f38b5a 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -326,6 +326,7 @@ intelInitExtensions(struct gl_context *ctx) if (brw->gen >= 7) { ctx->Extensions.ARB_conservative_depth = true; ctx->Extensions.ARB_derivative_control = true; + ctx->Extensions.ARB_framebuffer_no_attachments = true; ctx->Extensions.ARB_gpu_shader5 = true; ctx->Extensions.ARB_shader_atomic_counters = true; ctx->Extensions.ARB_texture_compression_bptc = true; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [v4 PATCH 03/10] mesa: Complete ARB_framebuffer_no_attachments in Mesa core
Implement GL_ARB_framebuffer_no_attachments in Mesa core - changes to conditions for framebuffer completenss - implement set/get functions for framebuffers for new functions in GL_ARB_framebuffer_no_attachments v2: Spacing and exceed 80 characters per line fixes. v3: Implement DSA functions of extension. v4: Formatting fixes. Add early return to functions when extension(s) not present. Signed-off-by: Kevin Rogovin --- src/mesa/main/fbobject.c | 220 --- 1 file changed, 187 insertions(+), 33 deletions(-) diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index 4ac3f20..f9858ad 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -1156,14 +1156,48 @@ _mesa_test_framebuffer_completeness(struct gl_context *ctx, } else if (att_layer_count > max_layer_count) { max_layer_count = att_layer_count; } + + /** + * The extension GL_ARB_framebuffer_no_attachments places additional + * requirement on each attachment. Those additional requirements are + * tighter that those of previous versions of GL. In interest of better + * compatibility, we will not enforce these restrictions. For the record + * those additional restrictions are quoted below: + * + * "The width and height of image are greater than zero and less than or + * equal to the values of the implementation-dependent limits + * MAX_FRAMEBUFFER_WIDTH and MAX_FRAMEBUFFER_HEIGHT, respectively." + * + * "If is a three-dimensional texture or a one- or two-dimensional + * array texture and the attachment is layered, the depth or layer count + * of the texture is less than or equal to the implementation-dependent + * limit MAX_FRAMEBUFFER_LAYERS." + * + * "If image has multiple samples, its sample count is less than or equal + * to the value of the implementation-dependent limit + * MAX_FRAMEBUFFER_SAMPLES." + * + * The same requirements are also in place for GL 4.5, + * Section 9.4.1 "Framebuffer Attachment Completeness", pg 310-311 + */ } fb->MaxNumLayers = max_layer_count; if (numImages == 0) { - fb->_Status = GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT_EXT; - fbo_incomplete(ctx, "no attachments", -1); - return; + fb->_HasAttachments = GL_FALSE; + + if (!ctx->Extensions.ARB_framebuffer_no_attachments) { + fb->_Status = GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT_EXT; + fbo_incomplete(ctx, "no attachments", -1); + return; + } + + if (fb->DefaultGeometry.Width == 0 || fb->DefaultGeometry.Height == 0) { + fb->_Status = GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT_EXT; + fbo_incomplete(ctx, "no attachments and default width or height is 0", -1); + return; + } } if (_mesa_is_desktop_gl(ctx) && !ctx->Extensions.ARB_ES2_compatibility) { @@ -1228,8 +1262,10 @@ _mesa_test_framebuffer_completeness(struct gl_context *ctx, * renderbuffers/textures are different sizes, the framebuffer * width/height will be set to the smallest width/height. */ - fb->Width = minWidth; - fb->Height = minHeight; + if (numImages != 0) { + fb->Width = minWidth; + fb->Height = minHeight; + } /* finally, update the visual info for the framebuffer */ _mesa_update_framebuffer_visual(ctx, fb); @@ -1335,32 +1371,129 @@ _mesa_BindRenderbufferEXT(GLenum target, GLuint renderbuffer) bind_renderbuffer(target, renderbuffer, true); } -extern void GLAPIENTRY +static void +framebuffer_parameteri(struct gl_context *ctx, struct gl_framebuffer *fb, + GLenum pname, GLint param, const char *func) +{ + switch (pname) { + case GL_FRAMEBUFFER_DEFAULT_WIDTH: + if (param < 0 || param > ctx->Const.MaxFramebufferWidth) +_mesa_error(ctx, GL_INVALID_VALUE, "%s", func); + else + fb->DefaultGeometry.Width = param; + break; + case GL_FRAMEBUFFER_DEFAULT_HEIGHT: + if (param < 0 || param > ctx->Const.MaxFramebufferHeight) +_mesa_error(ctx, GL_INVALID_VALUE, "%s", func); + else + fb->DefaultGeometry.Height = param; + break; + case GL_FRAMEBUFFER_DEFAULT_LAYERS: + if (param < 0 || param > ctx->Const.MaxFramebufferLayers) +_mesa_error(ctx, GL_INVALID_VALUE, "%s", func); + else + fb->DefaultGeometry.Layers = param; + break; + case GL_FRAMEBUFFER_DEFAULT_SAMPLES: + if (param < 0 || param > ctx->Const.MaxFramebufferSamples) +_mesa_error(ctx, GL_INVALID_VALUE, "%s", func); + else +fb->DefaultGeometry.NumSamples = param; + break; + case GL_FRAMEBUFFER_DEFAULT_FIXED_SAMPLE_LOCATIONS: + fb->DefaultGeometry.FixedSampleLocations = param; + break; + default: + _mesa_error(ctx, GL_INVALID_ENUM, + "
[Mesa-dev] [v4 PATCH 01/10] mesa: Define infrastructure for ARB_framebuffer_no_attachments
Define the infrastructure for the extension GL_ARB_framebuffer_no_attachments: - extension table - additions to gl_framebuffer v2: Formatting fixes. v3: mtypes.h: Correct comment on _HasAttachments. v4: Formatting fixes. mtypes.h: Clarify comments Signed-off-by: Kevin Rogovin --- src/mesa/main/extensions.c | 1 + src/mesa/main/fbobject.c| 1 + src/mesa/main/framebuffer.c | 1 + src/mesa/main/mtypes.h | 48 - 4 files changed, 46 insertions(+), 5 deletions(-) diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index c82416a..3256b2c 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -117,6 +117,7 @@ static const struct extension extension_table[] = { { "GL_ARB_fragment_program",o(ARB_fragment_program), GLL,2002 }, { "GL_ARB_fragment_program_shadow", o(ARB_fragment_program_shadow), GLL,2003 }, { "GL_ARB_fragment_shader", o(ARB_fragment_shader), GL, 2002 }, + { "GL_ARB_framebuffer_no_attachments", o(ARB_framebuffer_no_attachments), GL, 2012 }, { "GL_ARB_framebuffer_object", o(ARB_framebuffer_object), GL, 2005 }, { "GL_ARB_framebuffer_sRGB",o(EXT_framebuffer_sRGB), GL, 1998 }, { "GL_ARB_get_program_binary", o(dummy_true), GL, 2010 }, diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index 1859c27..8fea7f8 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -957,6 +957,7 @@ _mesa_test_framebuffer_completeness(struct gl_context *ctx, fb->Height = 0; fb->_AllColorBuffersFixedPoint = GL_TRUE; fb->_HasSNormOrFloatColorBuffer = GL_FALSE; + fb->_HasAttachments = GL_TRUE; /* Start at -2 to more easily loop over all attachment points. * -2: depth buffer diff --git a/src/mesa/main/framebuffer.c b/src/mesa/main/framebuffer.c index 665a5ba..c2cfb92 100644 --- a/src/mesa/main/framebuffer.c +++ b/src/mesa/main/framebuffer.c @@ -157,6 +157,7 @@ _mesa_initialize_window_framebuffer(struct gl_framebuffer *fb, fb->_Status = GL_FRAMEBUFFER_COMPLETE_EXT; fb->_AllColorBuffersFixedPoint = !visual->floatMode; fb->_HasSNormOrFloatColorBuffer = visual->floatMode; + fb->_HasAttachments = GL_TRUE; compute_depth_max(fb); } diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 8342517..5abbc0a 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -3146,12 +3146,29 @@ struct gl_framebuffer */ struct gl_config Visual; - GLuint Width, Height; /**< size of frame buffer in pixels */ + /** +* Size of frame buffer in pixels. If there are no attachments, then both +* of these are 0. +*/ + GLuint Width, Height; + + /** +* In the case that the framebuffer has no attachment (i.e. +* GL_ARB_framebuffer_no_attachments) then the geometry of +* the framebuffer is specified by the default values. +*/ + struct { + GLuint Width, Height, Layers, NumSamples; + GLboolean FixedSampleLocations; + } DefaultGeometry; - /** \name Drawing bounds (Intersection of buffer size and scissor box) */ + /** \name Drawing bounds (Intersection of buffer size and scissor box) +* The drawing region is given by [_Xmin, _Xmax) x [_Ymin, _Ymax), +* (inclusive for _Xmin and _Ymin while exclusive for _Xmax and _Ymax) +*/ /*@{*/ - GLint _Xmin, _Xmax; /**< inclusive */ - GLint _Ymin, _Ymax; /**< exclusive */ + GLint _Xmin, _Xmax; + GLint _Ymin, _Ymax; /*@}*/ /** \name Derived Z buffer stuff */ @@ -3164,6 +3181,16 @@ struct gl_framebuffer /** One of the GL_FRAMEBUFFER_(IN)COMPLETE_* tokens */ GLenum _Status; + /** Whether one of Attachment has Type != GL_NONE +* NOTE: the values for Width and Height are set to 0 in case of having +* no attachments, a backend driver supporting the extension +* GL_ARB_framebuffer_no_attachments must check for the flag _HasAttachments +* and if GL_FALSE, must then use the values in DefaultGeometry to initialize +* its viewport, scissor and so on (in particular _Xmin, _Xmax, _Ymin and +* _Ymax do NOT take into account _HasAttachments being false) +*/ + GLboolean _HasAttachments; + /** Integer color values */ GLboolean _IntegerColor; @@ -3174,7 +3201,9 @@ struct gl_framebuffer /** * The maximum number of layers in the framebuffer, or 0 if the framebuffer * is not layered. For cube maps and cube map arrays, each cube face -* counts as a layer. +* counts as a layer. As the case for Width, Height a backend driver +* supporting GL_ARB_framebuffer_no_attachments must use DefaultGeometry +* in the case t
[Mesa-dev] [v4 PATCH 10/10] mark GL_ARB_framebuffer_no_attachments as done for i965
Mark GL_ARB_framebuffer_no_attachments as done for i965. v2: File added to patch series v3: docs/GL3.txt : add done mark under GLES3.1 docs/relnotes/10.6.0.html : maintain alphabetical order v4: Move mark from 10.6.0.html to 10.7.0.html. Reviewed-by: Ian Romanick (v2) Signed-off-by: Kevin Rogovin --- docs/GL3.txt | 4 ++-- docs/relnotes/10.7.0.html | 4 +++- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 9d56ee5..fae8253 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -157,7 +157,7 @@ GL 4.3, GLSL 4.30: GL_KHR_debug DONE (all drivers) GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL) GL_ARB_fragment_layer_viewport DONE (nv50, nvc0, r600, llvmpipe) - GL_ARB_framebuffer_no_attachmentsnot started + GL_ARB_framebuffer_no_attachmentsDONE (i965) GL_ARB_internalformat_query2 not started GL_ARB_invalidate_subdataDONE (all drivers) GL_ARB_multi_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe) @@ -216,7 +216,7 @@ GLES3.1, GLSL ES 3.1 GL_ARB_compute_shaderin progress (jljusten) GL_ARB_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe) GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL) - GL_ARB_framebuffer_no_attachmentsnot started + GL_ARB_framebuffer_no_attachmentsDONE (i965) GL_ARB_program_interface_query DONE (all drivers) GL_ARB_shader_atomic_countersDONE (i965) GL_ARB_shader_image_load_store in progress (curro) diff --git a/docs/relnotes/10.7.0.html b/docs/relnotes/10.7.0.html index 6206716..fbf07df 100644 --- a/docs/relnotes/10.7.0.html +++ b/docs/relnotes/10.7.0.html @@ -43,7 +43,9 @@ TBD. Note: some of the new features are only available with certain drivers. -TBD. + +GL_ARB_framebuffer_no_attachments on i965 + Bug fixes -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [v4 PATCH 07/10] mesa: function for testing if current frag-shader has atomics
Add helper function that checks if current fragment shader active of gl_context has atomic buffer access. v3: Added in v3 of patch series. v4: Formatting fixes. Signed-off-by: Kevin Rogovin --- src/mesa/main/mtypes.h | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 08316dc..4edba00 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -4462,7 +4462,12 @@ enum _debug DEBUG_INCOMPLETE_FBO = (1 << 3) }; - +static inline bool +_mesa_active_fragment_shader_has_atomic_ops(const struct gl_context *ctx) +{ + return ctx->Shader._CurrentFragmentProgram != NULL && + ctx->Shader._CurrentFragmentProgram->NumAtomicBuffers > 0; +} #ifdef __cplusplus } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [v4 PATCH 06/10] i965: Use _mesa_geometry_ functions appropriately
Change references to gl_framebuffer::Width, Height, MaxNumLayers and Visual::samples to use the _mesa_geometry_ convenience functions for those places where the geometry of the gl_framebuffer is needed (in contrast to the geometry of the intersection of the attachments of the gl_framebuffer). This patch is to pave the way to enable GL_ARB_framebuffer_no_attachments on Gen7 and higher in i965. v2: Remove changes that would only be active in Gen4/5. Type and casting changes for consistency and readability. v3: Updates for rebase against master. v4: Use unsigned int inplace of int and remove casts to int. Formatting fixes. Signed-off-by: Kevin Rogovin --- src/mesa/drivers/dri/i965/brw_clip_state.c | 9 ++--- src/mesa/drivers/dri/i965/brw_misc_state.c | 9 ++--- src/mesa/drivers/dri/i965/brw_sf_state.c | 6 ++ src/mesa/drivers/dri/i965/brw_state_upload.c | 6 -- src/mesa/drivers/dri/i965/brw_wm.c | 7 --- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 +++- src/mesa/drivers/dri/i965/gen6_clip_state.c| 10 +++--- src/mesa/drivers/dri/i965/gen6_multisample_state.c | 3 ++- src/mesa/drivers/dri/i965/gen6_scissor_state.c | 12 +--- src/mesa/drivers/dri/i965/gen6_sf_state.c | 3 ++- src/mesa/drivers/dri/i965/gen6_viewport_state.c| 5 +++-- src/mesa/drivers/dri/i965/gen6_wm_state.c | 3 ++- src/mesa/drivers/dri/i965/gen7_sf_state.c | 3 ++- src/mesa/drivers/dri/i965/gen7_viewport_state.c| 5 +++-- src/mesa/drivers/dri/i965/gen7_wm_state.c | 3 ++- src/mesa/drivers/dri/i965/gen8_viewport_state.c| 8 +--- 16 files changed, 70 insertions(+), 34 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_clip_state.c b/src/mesa/drivers/dri/i965/brw_clip_state.c index 3223834..dee74db 100644 --- a/src/mesa/drivers/dri/i965/brw_clip_state.c +++ b/src/mesa/drivers/dri/i965/brw_clip_state.c @@ -32,6 +32,7 @@ #include "brw_context.h" #include "brw_state.h" #include "brw_defines.h" +#include "main/framebuffer.h" static void upload_clip_vp(struct brw_context *brw) @@ -59,7 +60,9 @@ brw_upload_clip_unit(struct brw_context *brw) struct brw_clip_unit_state *clip; /* _NEW_BUFFERS */ - struct gl_framebuffer *fb = ctx->DrawBuffer; + const struct gl_framebuffer *fb = ctx->DrawBuffer; + const float fb_width = (float)_mesa_geometric_width(fb); + const float fb_height = (float)_mesa_geometric_height(fb); upload_clip_vp(brw); @@ -127,8 +130,8 @@ brw_upload_clip_unit(struct brw_context *brw) /* enable guardband clipping if we can */ if (ctx->ViewportArray[0].X == 0 && ctx->ViewportArray[0].Y == 0 && - ctx->ViewportArray[0].Width == (float) fb->Width && - ctx->ViewportArray[0].Height == (float) fb->Height) + ctx->ViewportArray[0].Width == fb_width && + ctx->ViewportArray[0].Height == fb_height) { clip->clip5.guard_band_enable = 1; clip->clip6.clipper_viewport_state_ptr = diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c index 67a693b..5a4515b 100644 --- a/src/mesa/drivers/dri/i965/brw_misc_state.c +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c @@ -39,6 +39,7 @@ #include "brw_state.h" #include "brw_defines.h" +#include "main/framebuffer.h" #include "main/fbobject.h" #include "main/glformats.h" @@ -46,12 +47,14 @@ static void upload_drawing_rect(struct brw_context *brw) { struct gl_context *ctx = &brw->ctx; + const struct gl_framebuffer *fb = ctx->DrawBuffer; + const unsigned int fb_width = _mesa_geometric_width(fb); + const unsigned int fb_height = _mesa_geometric_height(fb); BEGIN_BATCH(4); OUT_BATCH(_3DSTATE_DRAWING_RECTANGLE << 16 | (4 - 2)); OUT_BATCH(0); /* xmin, ymin */ - OUT_BATCH(((ctx->DrawBuffer->Width - 1) & 0x) | - ((ctx->DrawBuffer->Height - 1) << 16)); + OUT_BATCH(((fb_width - 1) & 0x) | ((fb_height - 1) << 16)); OUT_BATCH(0); ADVANCE_BATCH(); } @@ -767,7 +770,7 @@ static void upload_polygon_stipple_offset(struct brw_context *brw) * works just fine, and there's no window system to worry about. */ if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) - OUT_BATCH((32 - (ctx->DrawBuffer->Height & 31)) & 31); + OUT_BATCH((32 - (_mesa_geometric_height(ctx->DrawBuffer) & 31)) & 31); else OUT_BATCH(0); ADVANCE_BATCH(); diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c b/src/mesa/drivers/dri/i965/brw_sf_state.c index 014b434..5d98922 100644 --- a/src/mesa/drivers/dri/i965/brw_sf_state.c +++ b/src/mesa/drivers/dri/i965/brw_sf_state.c @@ -52,6 +52,12 @@ static void upload_sf_vp(struct brw_context *brw) sizeof(*sfv), 32, &brw->sf.vp_offset); memset(sfv, 0, sizeof(*sfv)); + /* Accessing the fields Width and Height of gl_framebuffer to produce the +* values to program the viewport a
Re: [Mesa-dev] [PATCH 1/2] configure.ac: enable building GLES1 and GLES2 by default
Wow, I hadn't expected such a hateful comment on GLES1. Does anyone else want to convince me that GLES1 should burn in hell? Marek On Wed, May 27, 2015 at 7:01 AM, Dave Airlie wrote: > On 26 May 2015 at 22:08, Marek Olšák wrote: >> It's mainly for distributions. You can turn it off, but it won't make >> the build faster, because all code is shared. > > we (fedora/rhel) don't enable GLES1, and hope nobody else wants to either. > > gles2 is fine, leave gles1 die. > Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] configure.ac: enable building GLES1 and GLES2 by default
> Wow, I hadn't expected such a hateful comment on GLES1. > > Does anyone else want to convince me that GLES1 should burn in hell? So I dug around, commit 4c06853833996d990eb76b195ca5d6838c6f3d6b Author: Adam Jackson Date: Wed May 8 18:03:21 2013 -0400 Switch to Mesa master (pre 9.2) - Fix llvmpipe on big-endian and enable llvmpipe everywhere - Build vdpau drivers for r600/radeonsi/nouveau - Enable hardware floating-point texture support - Drop GLESv1, nothing's using it, let's not start So at least in Fedora 2 years ago, we realised there was no GLES1 users in the distro, and we didn't want to encourage any. I suppose some users might exist outside the classic Linux distro world. i.e. android, embedded land. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: fix binding validation for interface blocks
Cc: Kenneth Graunke --- It looks like this may never have actually worked, at least there is no way it could have for inteface block arrays :) No piglit regressions. src/glsl/ast_to_hir.cpp | 22 ++ 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 0ab664e..861bbfe 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -2041,10 +2041,10 @@ validate_matrix_layout_for_type(struct _mesa_glsl_parse_state *state, static bool validate_binding_qualifier(struct _mesa_glsl_parse_state *state, YYLTYPE *loc, - ir_variable *var, + const glsl_type *type, const ast_type_qualifier *qual) { - if (var->data.mode != ir_var_uniform) { + if (!qual->flags.q.uniform) { _mesa_glsl_error(loc, state, "the \"binding\" qualifier only applies to uniforms"); return false; @@ -2056,10 +2056,11 @@ validate_binding_qualifier(struct _mesa_glsl_parse_state *state, } const struct gl_context *const ctx = state->ctx; - unsigned elements = var->type->is_array() ? var->type->length : 1; + unsigned elements = type->is_array() ? type->length : 1; unsigned max_index = qual->binding + elements - 1; + const glsl_type *base_type = type->without_array(); - if (var->type->is_interface()) { + if (base_type->is_interface()) { /* UBOs. From page 60 of the GLSL 4.20 specification: * "If the binding point for any uniform block instance is less than zero, * or greater than or equal to the implementation-dependent maximum @@ -2077,8 +2078,7 @@ validate_binding_qualifier(struct _mesa_glsl_parse_state *state, ctx->Const.MaxUniformBufferBindings); return false; } - } else if (var->type->is_sampler() || - (var->type->is_array() && var->type->fields.array->is_sampler())) { + } else if (base_type->is_sampler()) { /* Samplers. From page 63 of the GLSL 4.20 specification: * "If the binding is less than zero, or greater than or equal to the * implementation-dependent maximum supported number of units, a @@ -2095,7 +2095,7 @@ validate_binding_qualifier(struct _mesa_glsl_parse_state *state, return false; } - } else if (var->type->contains_atomic()) { + } else if (base_type->contains_atomic()) { assert(ctx->Const.MaxAtomicBufferBindings <= MAX_COMBINED_ATOMIC_BUFFERS); if (unsigned(qual->binding) >= ctx->Const.MaxAtomicBufferBindings) { _mesa_glsl_error(loc, state, "layout(binding = %d) exceeds the " @@ -2651,7 +2651,7 @@ apply_type_qualifier_to_variable(const struct ast_type_qualifier *qual, } if (qual->flags.q.explicit_binding && - validate_binding_qualifier(state, loc, var, qual)) { + validate_binding_qualifier(state, loc, var->type, qual)) { var->data.explicit_binding = true; var->data.binding = qual->binding; } @@ -5658,6 +5658,8 @@ ast_interface_block::hir(exec_list *instructions, num_variables, packing, this->block_name); + if (this->layout.flags.q.explicit_binding) + validate_binding_qualifier(state, &loc, block_type, &this->layout); if (!state->symbols->add_interface(block_type->name, block_type, var_mode)) { YYLTYPE loc = this->get_location(); @@ -5744,6 +5746,10 @@ ast_interface_block::hir(exec_list *instructions, const glsl_type *block_array_type = process_array_type(&loc, block_type, this->array_specifier, state); + if(this->layout.flags.q.explicit_binding) +validate_binding_qualifier(state, &loc, block_array_type, + &this->layout); + var = new(state) ir_variable(block_array_type, this->instance_name, var_mode); -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v4 PATCH 06/10] i965: Use _mesa_geometry_ functions appropriately
Just some trivia: On Wed, May 27, 2015 at 5:49 AM, Kevin Rogovin wrote: > Change references to gl_framebuffer::Width, Height, MaxNumLayers > and Visual::samples to use the _mesa_geometry_ convenience functions Both here and in the subject, I presume you meant _mesa_geometric_* ? > for those places where the geometry of the gl_framebuffer is needed > (in contrast to the geometry of the intersection of the attachments > of the gl_framebuffer). > > This patch is to pave the way to enable GL_ARB_framebuffer_no_attachments > on Gen7 and higher in i965. > > v2: > Remove changes that would only be active in Gen4/5. > Type and casting changes for consistency and readability. > > v3: > Updates for rebase against master. > > v4: > Use unsigned int inplace of int and remove casts to int. > Formatting fixes. > > Signed-off-by: Kevin Rogovin > --- > diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c > b/src/mesa/drivers/dri/i965/brw_sf_state.c > index 014b434..5d98922 100644 > --- a/src/mesa/drivers/dri/i965/brw_sf_state.c > +++ b/src/mesa/drivers/dri/i965/brw_sf_state.c > @@ -52,6 +52,12 @@ static void upload_sf_vp(struct brw_context *brw) > sizeof(*sfv), 32, &brw->sf.vp_offset); > memset(sfv, 0, sizeof(*sfv)); > > + /* Accessing the fields Width and Height of gl_framebuffer to produce the > +* values to program the viewport and scissor is fine as long as the > +* gl_framebuffer has atleast one attachment. at least > +*/ > + assert(ctx->DrawBuffer->_HasAttachments); > + > if (render_to_fbo) { >y_scale = 1.0; >y_bias = 0; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] tgsi: handle indirect sampler arrays.
Am 27.05.2015 um 10:38 schrieb Dave Airlie: > This is required for ARB_gpu_shader5 support in softpipe. > > Signed-off-by: Dave Airlie > --- > src/gallium/auxiliary/tgsi/tgsi_exec.c | 23 ++- > 1 file changed, 22 insertions(+), 1 deletion(-) > > diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c > b/src/gallium/auxiliary/tgsi/tgsi_exec.c > index f080385..a2f52f0 100644 > --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c > +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c > @@ -2032,14 +2032,35 @@ exec_tex(struct tgsi_exec_machine *mach, > const struct tgsi_full_instruction *inst, > uint modifier, uint sampler) > { > - const uint unit = inst->Src[sampler].Register.Index; > const union tgsi_exec_channel *args[5], *proj = NULL; > union tgsi_exec_channel r[5]; > enum tgsi_sampler_control control = tgsi_sampler_lod_none; > uint chan; > + uint unit; > int8_t offsets[3]; > int dim, shadow_ref, i; > > + > + if (inst->Src[sampler].Register.Indirect) { > + const struct tgsi_full_src_register *reg = &inst->Src[sampler]; > + union tgsi_exec_channel indir_index, index2; > + > + index2.i[0] = > + index2.i[1] = > + index2.i[2] = > + index2.i[3] = reg->Indirect.Index; > + > + fetch_src_file_channel(mach, > + 0, > + reg->Indirect.File, > + reg->Indirect.Swizzle, > + &index2, > + &ZeroVec, > + &indir_index); > + unit = inst->Src[sampler].Register.Index + indir_index.i[0]; > + } else { > + unit = inst->Src[sampler].Register.Index; whitespace? > + } > /* always fetch all 3 offsets, overkill but keeps code simple */ > fetch_texel_offsets(mach, inst, offsets); > > Reviewed-by: Roland Scheidegger I think though there's a lot more required to make it work in softpipe... Roland ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v4 PATCH 05/10] mesa: helper function for scissor box of gl_framebuffer
More trivia: On Wed, May 27, 2015 at 5:49 AM, Kevin Rogovin wrote: > Add helper convenience function that intersects the scissor values > against a passed bounding box. In addition, to avoid replicated code, > make the function _mesa_scissor_bounding_box() use this new function. > > v2: > Split from patch "mesa:helper-conveniance functions for drivers to implement > ARB_framebuffer_no_attachment". > White space and long line fixes. > > v3: > No changes. > > v4: > No changes. > > > Signed-off-by: Kevin Rogovin > --- > src/mesa/main/framebuffer.c | 63 > +++-- > src/mesa/main/framebuffer.h | 5 > 2 files changed, 48 insertions(+), 20 deletions(-) > > diff --git a/src/mesa/main/framebuffer.c b/src/mesa/main/framebuffer.c > index c2cfb92..dd9e4bc 100644 > --- a/src/mesa/main/framebuffer.c > +++ b/src/mesa/main/framebuffer.c > @@ -357,42 +357,38 @@ update_framebuffer_size(struct gl_context *ctx, struct > gl_framebuffer *fb) > } > > > + > /** > - * Calculate the inclusive bounding box for the scissor of a specific > viewport > + * Given a bounding box, intersect the bounding box with the scissor of > + * a specified vieport. > * > * \param ctx GL context. > - * \param buffer Framebuffer to be checked against > * \param idx Index of the desired viewport > * \param bboxBounding box for the scissored viewport. Stored as xmin, > *xmax, ymin, ymax. > - * > - * \warning This function assumes that the framebuffer dimensions are up to > - * date (e.g., update_framebuffer_size has been recently called on \c > buffer). > - * > - * \sa _mesa_clip_to_region > */ > -void > -_mesa_scissor_bounding_box(const struct gl_context *ctx, > - const struct gl_framebuffer *buffer, > - unsigned idx, int *bbox) > +extern void > +_mesa_intersect_scissor_bounding_box(const struct gl_context *ctx, > + unsigned idx, int *bbox) > { > - bbox[0] = 0; > - bbox[2] = 0; > - bbox[1] = buffer->Width; > - bbox[3] = buffer->Height; > - > if (ctx->Scissor.EnableFlags & (1u << idx)) { > + int xmax, ymax; > + > + xmax = ctx->Scissor.ScissorArray[idx].X > ++ ctx->Scissor.ScissorArray[idx].Width; 44 instances of a leading + in mesa/main compared to 78 trailing ones. Huh, I was going to say that it's really uncommon to do this in mesa, but that may not be supported by fact. > + ymax = ctx->Scissor.ScissorArray[idx].Y > ++ ctx->Scissor.ScissorArray[idx].Height; >if (ctx->Scissor.ScissorArray[idx].X > bbox[0]) { > bbox[0] = ctx->Scissor.ScissorArray[idx].X; >} >if (ctx->Scissor.ScissorArray[idx].Y > bbox[2]) { > bbox[2] = ctx->Scissor.ScissorArray[idx].Y; >} > - if (ctx->Scissor.ScissorArray[idx].X + > ctx->Scissor.ScissorArray[idx].Width < bbox[1]) { > - bbox[1] = ctx->Scissor.ScissorArray[idx].X + > ctx->Scissor.ScissorArray[idx].Width; > + if (xmax < bbox[1]) { > + bbox[1] = xmax; >} > - if (ctx->Scissor.ScissorArray[idx].Y + > ctx->Scissor.ScissorArray[idx].Height < bbox[3]) { > - bbox[3] = ctx->Scissor.ScissorArray[idx].Y + > ctx->Scissor.ScissorArray[idx].Height; > + if (ymax < bbox[3]) { > +bbox[3] = ymax; indent looks messed up here. >} >/* finally, check for empty region */ >if (bbox[0] > bbox[1]) { > @@ -402,6 +398,33 @@ _mesa_scissor_bounding_box(const struct gl_context *ctx, > bbox[2] = bbox[3]; >} > } > +} > + > +/** > + * Calculate the inclusive bounding box for the scissor of a specific > viewport > + * > + * \param ctx GL context. > + * \param buffer Framebuffer to be checked against > + * \param idx Index of the desired viewport > + * \param bboxBounding box for the scissored viewport. Stored as xmin, > + *xmax, ymin, ymax. > + * > + * \warning This function assumes that the framebuffer dimensions are up to > + * date (e.g., update_framebuffer_size has been recently called on \c > buffer). > + * > + * \sa _mesa_clip_to_region > + */ > +void > +_mesa_scissor_bounding_box(const struct gl_context *ctx, > + const struct gl_framebuffer *buffer, > + unsigned idx, int *bbox) > +{ > + bbox[0] = 0; > + bbox[2] = 0; > + bbox[1] = buffer->Width; > + bbox[3] = buffer->Height; > + > + _mesa_intersect_scissor_bounding_box(ctx, idx, bbox); > > assert(bbox[0] <= bbox[1]); > assert(bbox[2] <= bbox[3]); > diff --git a/src/mesa/main/framebuffer.h b/src/mesa/main/framebuffer.h > index 8b2aa34..bb1f2ea 100644 > --- a/src/mesa/main/framebuffer.h > +++ b/src/mesa/main/framebuffer.h > @@ -76,6 +76,10 @@ _mesa_scissor_bounding_box(const struct gl_context *ctx, > const struct gl_framebuffer *buffer, > unsigned idx, int
Re: [Mesa-dev] [v4 PATCH 04/10] mesa: add helper functions for geometry of gl_framebuffer
More trivia: On Wed, May 27, 2015 at 5:49 AM, Kevin Rogovin wrote: > Add convenience helper functions for fetching geometry of gl_framebuffer > that return the geometry of the gl_framebuffer instead of the geometry of > the buffers of the gl_framebuffer when then the gl_framebuffer has no > attachments. > > v2: > Split from patch "mesa:helper-conveniance functions for drivers to implement > ARB_framebuffer_no_attachment". > > v3: > Add error check for functions of extension. > Implement DSA functions dependent on extension. > > v4: > Formatting fixes. > > Signed-off-by: Kevin Rogovin > --- > src/mesa/main/framebuffer.h | 28 > src/mesa/main/mtypes.h | 8 +++- > 2 files changed, 35 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/main/framebuffer.h b/src/mesa/main/framebuffer.h > index d02b86f..8b2aa34 100644 > --- a/src/mesa/main/framebuffer.h > +++ b/src/mesa/main/framebuffer.h > @@ -76,6 +76,34 @@ _mesa_scissor_bounding_box(const struct gl_context *ctx, > const struct gl_framebuffer *buffer, > unsigned idx, int *bbox); > > +static inline GLuint Here and below, why 2 spaces between "inline" and "GLuint"? > +_mesa_geometric_width(const struct gl_framebuffer *buffer) > +{ > + return buffer->_HasAttachments ? > + buffer->Width : buffer->DefaultGeometry.Width; > +} > + > +static inline GLuint > +_mesa_geometric_height(const struct gl_framebuffer *buffer) > +{ > + return buffer->_HasAttachments ? > + buffer->Height : buffer->DefaultGeometry.Height; > +} > + > +static inline GLuint > +_mesa_geometric_samples(const struct gl_framebuffer *buffer) > +{ > + return buffer->_HasAttachments ? > + buffer->Visual.samples : buffer->DefaultGeometry.NumSamples; > +} > + > +static inline GLuint > +_mesa_geometric_layers(const struct gl_framebuffer *buffer) > +{ > + return buffer->_HasAttachments ? > + buffer->MaxNumLayers : buffer->DefaultGeometry.Layers; > +} > + > extern void > _mesa_update_draw_buffer_bounds(struct gl_context *ctx, > struct gl_framebuffer *drawFb); > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h > index 5abbc0a..08316dc 100644 > --- a/src/mesa/main/mtypes.h > +++ b/src/mesa/main/mtypes.h > @@ -3187,7 +3187,13 @@ struct gl_framebuffer > * GL_ARB_framebuffer_no_attachments must check for the flag > _HasAttachments > * and if GL_FALSE, must then use the values in DefaultGeometry to > initialize > * its viewport, scissor and so on (in particular _Xmin, _Xmax, _Ymin and > -* _Ymax do NOT take into account _HasAttachments being false) > +* _Ymax do NOT take into account _HasAttachments being false). To get the > +* geometry of the framebuffer, the helper functions Why 2 spaces between "the" and "helper"? > +* _mesa_geometric_width(), > +* _mesa_geometric_height(), > +* _mesa_geometric_samples() and > +* _mesa_geometric_layers() > +* are available that check _HasAttachments. > */ > GLboolean _HasAttachments; > > -- > 1.9.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 7/7] docs/GL3.txt: update softpipe arb_gpu_shader5 status.
On Wed, May 27, 2015 at 3:46 AM, Dave Airlie wrote: > Signed-off-by: Dave Airlie > --- > docs/GL3.txt | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/docs/GL3.txt b/docs/GL3.txt > index 8e1c8cd..9fbde36 100644 > --- a/docs/GL3.txt > +++ b/docs/GL3.txt > @@ -99,13 +99,13 @@ GL 4.0, GLSL 4.00: >GL_ARB_gpu_shader5 DONE (i965, nvc0) >- 'precise' qualifierDONE >- Dynamically uniform sampler array indices DONE (r600) > - - Dynamically uniform UBO array indices DONE (r600) > + - Dynamically uniform UBO array indices DONE (r600, softpipe) I didn't see patches for this.. does it already Just Work? >- Implicit signed -> unsigned conversionsDONE >- Fused multiply-add DONE () >- Packing/bitfield/conversion functions DONE (r600, radeonsi, > softpipe) >- Enhanced textureGather DONE (r600, radeonsi, > softpipe) > - - Geometry shader instancing DONE (r600) > - - Geometry shader multiple streams DONE () > + - Geometry shader instancing DONE (r600, softpipe) > + - Geometry shader multiple streams DONE (softpipe) >- Enhanced per-sample shadingDONE (r600, radeonsi) >- Interpolation functionsDONE (r600) >- New overload resolution rules DONE > -- > 2.1.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] draw/tgsi: implement geom shader invocation support.
Am 27.05.2015 um 09:45 schrieb Dave Airlie: > From: Dave Airlie > > This is just for softpipe, llvmpipe won't work without > some changes. > > Signed-off-by: Dave Airlie > --- > src/gallium/auxiliary/draw/draw_gs.c | 47 > +- > src/gallium/auxiliary/draw/draw_gs.h | 2 ++ > src/gallium/auxiliary/tgsi/tgsi_scan.c | 2 ++ > src/gallium/auxiliary/tgsi/tgsi_scan.h | 1 + > 4 files changed, 34 insertions(+), 18 deletions(-) > > diff --git a/src/gallium/auxiliary/draw/draw_gs.c > b/src/gallium/auxiliary/draw/draw_gs.c > index 6375d41..755e527 100644 > --- a/src/gallium/auxiliary/draw/draw_gs.c > +++ b/src/gallium/auxiliary/draw/draw_gs.c > @@ -190,9 +190,15 @@ static void tgsi_gs_prepare(struct draw_geometry_shader > *shader, > const unsigned > constants_size[PIPE_MAX_CONSTANT_BUFFERS]) > { > struct tgsi_exec_machine *machine = shader->machine; > - > + int j; > tgsi_exec_set_constant_buffers(machine, PIPE_MAX_CONSTANT_BUFFERS, >constants, constants_size); > + > + if (shader->info.uses_invocationid) { > + unsigned i = machine->SysSemanticToIndex[TGSI_SEMANTIC_INVOCATIONID]; > + for (j = 0; j < TGSI_QUAD_SIZE; j++) > + machine->SystemValue[i].i[j] = shader->invocation_id; > + } > } > > static unsigned tgsi_gs_run(struct draw_geometry_shader *shader, > @@ -555,7 +561,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader > *shader, > * overflown vertices into some area where they won't harm anyone */ > unsigned total_verts_per_buffer = shader->primitive_boundary * >num_in_primitives; > - > + unsigned invocation; > //Assume at least one primitive > max_out_prims = MAX2(max_out_prims, 1); > > @@ -564,7 +570,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader > *shader, > output_verts->stride = output_verts->vertex_size; > output_verts->verts = >(struct vertex_header *)MALLOC(output_verts->vertex_size * > - total_verts_per_buffer); > + total_verts_per_buffer * > shader->num_invocations); This can't get very large, right? Otherwise, Reviewed-by: Roland Scheidegger > debug_assert(output_verts->verts); > > #if 0 > @@ -592,7 +598,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader > *shader, > shader->input = input; > shader->input_info = input_info; > FREE(shader->primitive_lengths); > - shader->primitive_lengths = MALLOC(max_out_prims * sizeof(unsigned)); > + shader->primitive_lengths = MALLOC(max_out_prims * sizeof(unsigned) * > shader->num_invocations); > > > #ifdef HAVE_LLVM > @@ -622,23 +628,26 @@ int draw_geometry_shader_run(struct > draw_geometry_shader *shader, > } > #endif > > - shader->prepare(shader, constants, constants_size); > + for (invocation = 0; invocation < shader->num_invocations; invocation++) { > + shader->invocation_id = invocation; > > - if (input_prim->linear) > - gs_run(shader, input_prim, input_verts, > - output_prims, output_verts); > - else > - gs_run_elts(shader, input_prim, input_verts, > - output_prims, output_verts); > + shader->prepare(shader, constants, constants_size); > > - /* Flush the remaining primitives. Will happen if > -* num_input_primitives % 4 != 0 > -*/ > - if (shader->fetched_prim_count > 0) { > - gs_flush(shader); > - } > + if (input_prim->linear) > + gs_run(shader, input_prim, input_verts, > +output_prims, output_verts); > + else > + gs_run_elts(shader, input_prim, input_verts, > + output_prims, output_verts); > > - debug_assert(shader->fetched_prim_count == 0); > + /* Flush the remaining primitives. Will happen if > + * num_input_primitives % 4 != 0 > + */ > + if (shader->fetched_prim_count > 0) { > + gs_flush(shader); > + } > + debug_assert(shader->fetched_prim_count == 0); > + } > > /* Update prim_info: > */ > @@ -771,6 +780,8 @@ draw_create_geometry_shader(struct draw_context *draw, > gs->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM]; > gs->max_output_vertices = > gs->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES]; > + gs->num_invocations = > + gs->info.properties[TGSI_PROPERTY_GS_INVOCATIONS]; > if (!gs->max_output_vertices) >gs->max_output_vertices = 32; > > diff --git a/src/gallium/auxiliary/draw/draw_gs.h > b/src/gallium/auxiliary/draw/draw_gs.h > index 49e93d5..663ba84 100644 > --- a/src/gallium/auxiliary/draw/draw_gs.h > +++ b/src/gallium/auxiliary/draw/draw_gs.h > @@ -90,6 +90,8 @@ struct draw_geometry_shader { > unsigned vector_length; > unsigned max_out_prims; > > + unsigned num_invocations; > + unsigned invocation_id; > #ifdef HAVE_LLVM > struct draw_gs_inpu
Re: [Mesa-dev] [PATCH 1/7] draw/tgsi: implement geom shader invocation support.
On Wed, May 27, 2015 at 7:42 AM, Roland Scheidegger wrote: > Am 27.05.2015 um 09:45 schrieb Dave Airlie: >> From: Dave Airlie >> >> This is just for softpipe, llvmpipe won't work without >> some changes. >> >> Signed-off-by: Dave Airlie >> --- >> src/gallium/auxiliary/draw/draw_gs.c | 47 >> +- >> src/gallium/auxiliary/draw/draw_gs.h | 2 ++ >> src/gallium/auxiliary/tgsi/tgsi_scan.c | 2 ++ >> src/gallium/auxiliary/tgsi/tgsi_scan.h | 1 + >> 4 files changed, 34 insertions(+), 18 deletions(-) >> >> diff --git a/src/gallium/auxiliary/draw/draw_gs.c >> b/src/gallium/auxiliary/draw/draw_gs.c >> index 6375d41..755e527 100644 >> --- a/src/gallium/auxiliary/draw/draw_gs.c >> +++ b/src/gallium/auxiliary/draw/draw_gs.c >> @@ -190,9 +190,15 @@ static void tgsi_gs_prepare(struct draw_geometry_shader >> *shader, >> const unsigned >> constants_size[PIPE_MAX_CONSTANT_BUFFERS]) >> { >> struct tgsi_exec_machine *machine = shader->machine; >> - >> + int j; >> tgsi_exec_set_constant_buffers(machine, PIPE_MAX_CONSTANT_BUFFERS, >>constants, constants_size); >> + >> + if (shader->info.uses_invocationid) { >> + unsigned i = machine->SysSemanticToIndex[TGSI_SEMANTIC_INVOCATIONID]; >> + for (j = 0; j < TGSI_QUAD_SIZE; j++) >> + machine->SystemValue[i].i[j] = shader->invocation_id; >> + } >> } >> >> static unsigned tgsi_gs_run(struct draw_geometry_shader *shader, >> @@ -555,7 +561,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader >> *shader, >> * overflown vertices into some area where they won't harm anyone */ >> unsigned total_verts_per_buffer = shader->primitive_boundary * >>num_in_primitives; >> - >> + unsigned invocation; >> //Assume at least one primitive >> max_out_prims = MAX2(max_out_prims, 1); >> >> @@ -564,7 +570,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader >> *shader, >> output_verts->stride = output_verts->vertex_size; >> output_verts->verts = >>(struct vertex_header *)MALLOC(output_verts->vertex_size * >> - total_verts_per_buffer); >> + total_verts_per_buffer * >> shader->num_invocations); > This can't get very large, right? With ARB_gpu_shader5, it has to be at least 32. In mesa, it's always defined as 32. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] configure.ac: enable building GLES1 and GLES2 by default
On 27 May 2015 at 11:23, Dave Airlie wrote: >> Wow, I hadn't expected such a hateful comment on GLES1. >> >> Does anyone else want to convince me that GLES1 should burn in hell? > > So I dug around, > > commit 4c06853833996d990eb76b195ca5d6838c6f3d6b > Author: Adam Jackson > Date: Wed May 8 18:03:21 2013 -0400 > > Switch to Mesa master (pre 9.2) > > - Fix llvmpipe on big-endian and enable llvmpipe everywhere > - Build vdpau drivers for r600/radeonsi/nouveau > - Enable hardware floating-point texture support > - Drop GLESv1, nothing's using it, let's not start > > So at least in Fedora 2 years ago, we realised there was no GLES1 > users in the distro, > and we didn't want to encourage any. > > I suppose some users might exist outside the classic Linux distro > world. i.e. android, embedded land. > At least three other distros have GLES1 - Arch, Debian and OpenSUSE. So imho one should just leave the decision to depreciate/kill it off to the distros ? -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] draw/tgsi: implement geom shader invocation support.
Am 27.05.2015 um 13:47 schrieb Ilia Mirkin: > On Wed, May 27, 2015 at 7:42 AM, Roland Scheidegger > wrote: >> Am 27.05.2015 um 09:45 schrieb Dave Airlie: >>> From: Dave Airlie >>> >>> This is just for softpipe, llvmpipe won't work without >>> some changes. >>> >>> Signed-off-by: Dave Airlie >>> --- >>> src/gallium/auxiliary/draw/draw_gs.c | 47 >>> +- >>> src/gallium/auxiliary/draw/draw_gs.h | 2 ++ >>> src/gallium/auxiliary/tgsi/tgsi_scan.c | 2 ++ >>> src/gallium/auxiliary/tgsi/tgsi_scan.h | 1 + >>> 4 files changed, 34 insertions(+), 18 deletions(-) >>> >>> diff --git a/src/gallium/auxiliary/draw/draw_gs.c >>> b/src/gallium/auxiliary/draw/draw_gs.c >>> index 6375d41..755e527 100644 >>> --- a/src/gallium/auxiliary/draw/draw_gs.c >>> +++ b/src/gallium/auxiliary/draw/draw_gs.c >>> @@ -190,9 +190,15 @@ static void tgsi_gs_prepare(struct >>> draw_geometry_shader *shader, >>> const unsigned >>> constants_size[PIPE_MAX_CONSTANT_BUFFERS]) >>> { >>> struct tgsi_exec_machine *machine = shader->machine; >>> - >>> + int j; >>> tgsi_exec_set_constant_buffers(machine, PIPE_MAX_CONSTANT_BUFFERS, >>>constants, constants_size); >>> + >>> + if (shader->info.uses_invocationid) { >>> + unsigned i = machine->SysSemanticToIndex[TGSI_SEMANTIC_INVOCATIONID]; >>> + for (j = 0; j < TGSI_QUAD_SIZE; j++) >>> + machine->SystemValue[i].i[j] = shader->invocation_id; >>> + } >>> } >>> >>> static unsigned tgsi_gs_run(struct draw_geometry_shader *shader, >>> @@ -555,7 +561,7 @@ int draw_geometry_shader_run(struct >>> draw_geometry_shader *shader, >>> * overflown vertices into some area where they won't harm anyone */ >>> unsigned total_verts_per_buffer = shader->primitive_boundary * >>>num_in_primitives; >>> - >>> + unsigned invocation; >>> //Assume at least one primitive >>> max_out_prims = MAX2(max_out_prims, 1); >>> >>> @@ -564,7 +570,7 @@ int draw_geometry_shader_run(struct >>> draw_geometry_shader *shader, >>> output_verts->stride = output_verts->vertex_size; >>> output_verts->verts = >>>(struct vertex_header *)MALLOC(output_verts->vertex_size * >>> - total_verts_per_buffer); >>> + total_verts_per_buffer * >>> shader->num_invocations); >> This can't get very large, right? > > With ARB_gpu_shader5, it has to be at least 32. In mesa, it's always > defined as 32. > I was more wondering if the buffer without the num_invocations mul was already defined somewhere to not exceed some limit. Probably should be ok though... Roland ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] configure.ac: enable building GLES1 and GLES2 by default
The cost of shipping GLES1 is 27KB for libGLESv1_CM.so.1.1.0, so I honestly think there shouldn't even be an option to disable it. It's not something I particularly care about though. Marek On Wed, May 27, 2015 at 1:53 PM, Emil Velikov wrote: > On 27 May 2015 at 11:23, Dave Airlie wrote: >>> Wow, I hadn't expected such a hateful comment on GLES1. >>> >>> Does anyone else want to convince me that GLES1 should burn in hell? >> >> So I dug around, >> >> commit 4c06853833996d990eb76b195ca5d6838c6f3d6b >> Author: Adam Jackson >> Date: Wed May 8 18:03:21 2013 -0400 >> >> Switch to Mesa master (pre 9.2) >> >> - Fix llvmpipe on big-endian and enable llvmpipe everywhere >> - Build vdpau drivers for r600/radeonsi/nouveau >> - Enable hardware floating-point texture support >> - Drop GLESv1, nothing's using it, let's not start >> >> So at least in Fedora 2 years ago, we realised there was no GLES1 >> users in the distro, >> and we didn't want to encourage any. >> >> I suppose some users might exist outside the classic Linux distro >> world. i.e. android, embedded land. >> > At least three other distros have GLES1 - Arch, Debian and OpenSUSE. > So imho one should just leave the decision to depreciate/kill it off > to the distros ? > > -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/11] tgsi/scan: get more information about arrays and handle arrays correctly
Yes, that looks better to me. It may be a bit inconsistent to only do that for inputs/outputs and not for temp but at least this way there's no additional limit on the number of arrays. Roland Am 26.05.2015 um 13:04 schrieb Marek Olšák: > From: Marek Olšák > > v2: use less memory for the information > --- > src/gallium/auxiliary/tgsi/tgsi_scan.c | 24 +--- > src/gallium/auxiliary/tgsi/tgsi_scan.h | 4 > 2 files changed, 25 insertions(+), 3 deletions(-) > > diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c > b/src/gallium/auxiliary/tgsi/tgsi_scan.c > index d821072..369f56a 100644 > --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c > +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c > @@ -167,13 +167,31 @@ tgsi_scan_shader(const struct tgsi_token *tokens, > = &parse.FullToken.FullDeclaration; > const uint file = fulldecl->Declaration.File; > uint reg; > -if (fulldecl->Declaration.Array) > - info->array_max[file] = MAX2(info->array_max[file], > fulldecl->Array.ArrayID); > + > +if (fulldecl->Declaration.Array) { > + unsigned array_id = fulldecl->Array.ArrayID; > + > + switch (file) { > + case TGSI_FILE_INPUT: > + assert(array_id < ARRAY_SIZE(info->input_array_first)); > + info->input_array_first[array_id] = fulldecl->Range.First; > + info->input_array_last[array_id] = fulldecl->Range.Last; > + break; > + case TGSI_FILE_OUTPUT: > + assert(array_id < ARRAY_SIZE(info->output_array_first)); > + info->output_array_first[array_id] = fulldecl->Range.First; > + info->output_array_last[array_id] = fulldecl->Range.Last; > + break; > + } > + info->array_max[file] = MAX2(info->array_max[file], array_id); > +} > + > for (reg = fulldecl->Range.First; > reg <= fulldecl->Range.Last; > reg++) { > unsigned semName = fulldecl->Semantic.Name; > - unsigned semIndex = fulldecl->Semantic.Index; > + unsigned semIndex = > + fulldecl->Semantic.Index + (reg - fulldecl->Range.First); > > /* only first 32 regs will appear in this bitfield */ > info->file_mask[file] |= (1 << reg); > diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h > b/src/gallium/auxiliary/tgsi/tgsi_scan.h > index 0ea0e88..af4b128 100644 > --- a/src/gallium/auxiliary/tgsi/tgsi_scan.h > +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h > @@ -65,6 +65,10 @@ struct tgsi_shader_info > int file_max[TGSI_FILE_COUNT]; /**< highest index of declared registers > */ > int const_file_max[PIPE_MAX_CONSTANT_BUFFERS]; > > + ubyte input_array_first[PIPE_MAX_SHADER_INPUTS]; > + ubyte input_array_last[PIPE_MAX_SHADER_INPUTS]; > + ubyte output_array_first[PIPE_MAX_SHADER_OUTPUTS]; > + ubyte output_array_last[PIPE_MAX_SHADER_OUTPUTS]; > unsigned array_max[TGSI_FILE_COUNT]; /**< highest index array per > register file */ > > uint immediate_count; /**< number of immediates declared */ > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Mesa 10.6.0 release candidate 2
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 The second release candidate for Mesa 10.6.0 is now available. The release mostly consists of nouveau driver patches, although some freedreno, i965, radeonsi, glapi and build fixes are also present. Alan Coopersmith (1): swrast: Build fix for Solaris Brian Paul (1): mesa: do not use _glapi_new_nop_table() for DRI builds Dave Airlie (1): u_math: uses assert, include assert.h Emil Velikov (2): targets/osmesa: drop the -module tag from LDFLAGS Increment version to 10.6.0-rc2 Ilia Mirkin (19): freedreno/ir3: fix immediate usage in tgsi tex fe freedreno/a3xx: set .zw of sprite coords to .01 nvc0/ir: set ftz when sources are floats, not just destinations nv50/ir: guess that the constant offset is the starting slot of array nvc0/ir: LOAD's can't be used for shader inputs nvc0: a geometry shader can have up to 1024 vertices output nv50/ir: avoid messing up arg1 of PFETCH nv30: don't leak fragprog consts nv30: avoid leaking render state and draw shaders nv30: fix clip plane uploads and enable changes nv30/draw: avoid leaving stale pointers in draw state nv30/draw: draw expects constbuf size in bytes, not vec4 units st/mesa: don't leak glsl_to_tgsi object on link failure glsl: avoid leaking linked gl_shader when there's a late linker error nv30/draw: fix indexed draws with swtnl path and a resource index buffer nv30/draw: only use the DMA1 object (GART) if the bo is not in VRAM nv30/draw: allocate vertex buffers in gart nv30/draw: switch varying hookup logic to know about texcoords nv30: falling back to draw path for edgeflag does no good Jason Ekstrand (1): i965/fs: Fix implied_mrf_writes for scratch writes Jeremy Huddleston Sequoia (1): darwin: Fix install name of libOSMesa Koop Mast (1): clover: Build fix for FreeBSD. Marek Olšák (2): cso: add context cleanup code from st/mesa radeonsi: fix scratch buffer setup for geometry shaders Neil Roberts (1): i965/skl: Add a message header for the TXF_MCS instruction in vec4vs Roland Scheidegger (1): llvmpipe: (trivial) add parantheses in (!x == y) expression git tag: mesa-10.6.0-rc2 ftp://ftp.freedesktop.org/pub/mesa/10.6.0/mesa-10.6.0-rc2.tar.gz MD5: b3726d9250f2e4093f03b603af20b5f4 mesa-10.6.0-rc2.tar.gz SHA1: ccffe808c26bcfbe7f58721d85bf4d728f4ce87b mesa-10.6.0-rc2.tar.gz SHA256: e83d46d0a70641f345021486eadcd6546085411e8ef836b2bf8276a184045fca mesa-10.6.0-rc2.tar.gz PGP: ftp://ftp.freedesktop.org/pub/mesa/10.6.0/mesa-10.6.0-rc2.tar.gz.sig ftp://ftp.freedesktop.org/pub/mesa/10.6.0/mesa-10.6.0-rc2.tar.xz MD5: cdc80cd971b0e81828e2f2acad86e367 mesa-10.6.0-rc2.tar.xz SHA1: a3c1b96fd859bc970400e06cbfe3faa50ad76dc7 mesa-10.6.0-rc2.tar.xz SHA256: eebf8d73352ce17533d690c8e36a78c8ed2ce1592e1bb961316104cda5edf9b2 mesa-10.6.0-rc2.tar.xz PGP: ftp://ftp.freedesktop.org/pub/mesa/10.6.0/mesa-10.6.0-rc2.tar.xz.sig - -- - -Emil -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQIcBAEBAgAGBQJVZcdiAAoJEO2uN7As60kN2P4QAJYjitY069fUJn0gVZNSar+r mX0vDzFkzq+NloFOL83kSAMc42UscIUioDbLZfrwNeL6w0kCXQrgiQ7wwcw0y/U4 4TTSZMZDWci8siWsoPcPh8qjgGpsYcO9swh4DeiUHEZ0OcEbM+HRwycmnk4B1kpI 4WGymN8uwl/bayp11TGZrGQK6n/bMNXOfadw8xMnTu6b+0PszbQFfKfXj++aQSTZ p4TrJb+RLp7TBPfPz+r+YtLeRLlRFaIzcY0iCf8GaRmBht3o/dUsqICbQBX6KdOx cJzAiDSEVedl7TOhJuyWJJmLrKRk4XvrXGzZmOBbemQUV2vNd0UIlP8w3A+CfZiS J0LGPNoG4aFC+hDgVD/Ntls81VxkMKAOnJBM+Nk2+KknBWP/bJgEBUeoU9Qq+Mob qQW9DK66wbehC2pfTUftU/xZHvtHVOKAjFjJ96JX2m72thy66c0lpTmpAs7R2X3e OHs4QtktRe5yvQW9kNDGG6YVZqSNPiyM/U9Kj3x17y7kGdoD8lIa20KfAroKFn4R dLFXODfuexTNxXOG8xEgytmOTP3bHof/cR1kvR548aZ6ZaUuUyIjMQ6YStGzUWPy k9y2EHHTYO+BoYvawLoZdlB3vVnSP5CzQ4ir18bh6QAj7Y1GOcy6AFWdpiVFyXBE InMvMmxQUO2c8J5vqpr1 =fmEy -END PGP SIGNATURE- ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] clover: log build options when dumping clc source
EdB writes: > --- > src/gallium/state_trackers/clover/llvm/invocation.cpp | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp > b/src/gallium/state_trackers/clover/llvm/invocation.cpp > index 7d2d941..fc7ab9c 100644 > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp > @@ -708,8 +708,10 @@ clover::compile_program_llvm(const std::string &source, > > llvm_ctx.setDiagnosticHandler(diagnostic_handler, &r_log); > > - if (get_debug_flags() & DBG_CLC) > - debug_log(source, ".cl"); > + if (get_debug_flags() & DBG_CLC) { > + const std::string src = "// Build options: " + opts + '\n' + source; > + debug_log(src, ".cl"); > + } Thank you, pushed with a minor clean-up. > > // The input file name must have the .cl extension in order for the > // CompilerInvocation class to recognize it as an OpenCL source file. > -- > 2.1.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v4 PATCH 05/10] mesa: helper function for scissor box of gl_framebuffer
Hi, > 44 instances of a leading + in mesa/main compared to 78 trailing ones. > Huh, I was going to say that it's really uncommon to do this in mesa, but > that may not be supported by fact. Considering there is a formatting issue below, I can change it to trailing. I don’t care. >> + if (ymax < bbox[3]) { >> +bbox[3] = ymax; >indent looks messed up here. Sighs, because it is. I missed that line, and I have missed that one since v2. I really wish I had a script to check the indenting. Sighs. > I don't think the "extern" isn't really necessary here. A lot of older code > still has it, but you don't have to stick it in on newer additions. I can kill the extern, I don't care; however, all the previous existing functions in the header had extern on them. Is it better for consistency within the file to keep the extern? >> >> + > What's with the extra newline? Double sighs. Apparently, my git skills are lacking and I let that added line sneak in when I had killed it from Patch 4. > static inline GLuint > _mesa_geometric_height(const struct gl_framebuffer *buffer) { > -- > 1.9.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v4 PATCH 04/10] mesa: add helper functions for geometry of gl_framebuffer
Hi, >> +static inline GLuint >Here and below, why 2 spaces between "inline" and "GLuint"? I have no clue. I suspect it is a scar from some search/replace fiasco over 3 weeks ago. You are the first person who spotted that nit. >> --- a/src/mesa/main/mtypes.h >> +++ b/src/mesa/main/mtypes.h >> @@ -3187,7 +3187,13 @@ struct gl_framebuffer >> * GL_ARB_framebuffer_no_attachments must check for the flag >> _HasAttachments >> * and if GL_FALSE, must then use the values in DefaultGeometry to >> initialize > > * its viewport, scissor and so on (in particular _Xmin, _Xmax, _Ymin and > >-* _Ymax do NOT take into account _HasAttachments being false) > >+* _Ymax do NOT take into account _HasAttachments being false). To get > >the > >+* geometry of the framebuffer, the helper functions > Why 2 spaces between "the" and "helper"? No clue. You are again the first person to spot that space. It would be nice to get a "fix these nits I found and get a reviewed by" badge. -Kevin ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v4 PATCH 04/10] mesa: add helper functions for geometry of gl_framebuffer
On Wed, May 27, 2015 at 9:21 AM, Rogovin, Kevin wrote: > Hi, >>> +static inline GLuint >>Here and below, why 2 spaces between "inline" and "GLuint"? > > I have no clue. I suspect it is a scar from some search/replace fiasco over > 3 weeks ago. You are the first person who spotted that nit. > >>> --- a/src/mesa/main/mtypes.h >>> +++ b/src/mesa/main/mtypes.h >>> @@ -3187,7 +3187,13 @@ struct gl_framebuffer >>> * GL_ARB_framebuffer_no_attachments must check for the flag >>> _HasAttachments >>> * and if GL_FALSE, must then use the values in DefaultGeometry to >>> initialize >> > * its viewport, scissor and so on (in particular _Xmin, _Xmax, _Ymin >> > and >> >-* _Ymax do NOT take into account _HasAttachments being false) >> >+* _Ymax do NOT take into account _HasAttachments being false). To get >> >the >> >+* geometry of the framebuffer, the helper functions > >> Why 2 spaces between "the" and "helper"? > > No clue. You are again the first person to spot that space. > > It would be nice to get a "fix these nits I found and get a reviewed by" > badge. Unfortunately I'm not in a great position to do that -- I'm good at noticing simple issues, but I'm not always familiar with the larger GL implications. Framebuffers are still a bit in the great unknown for me (but I think I've finally mastered the concept of a texture). But I glanced at your patches and spotted a couple of things that seemed odd, so figured I'd point them out. You should wait until you get a "real" review from someone before reposting. Cheers, -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v4 PATCH 04/10] mesa: add helper functions for geometry of gl_framebuffer
> You should wait until you get a "real" review from someone before reposting. I think that is good advice. Indeed, I am not going to post a v5 of this series until for each patch there is: Two review bys, possibly qualified with "fix the mentioned formatting issues" OR Specific change requests. The patch series has not really changed in any meaningful way (in terms of compiled output) since v2. -Kevin ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallivm: Do not use NoFramePointerElim with LLVM 3.7
On Tue, May 26, 2015 at 10:27:34PM -0700, Vinson Lee wrote: > TargetOptions::NoFramePointerElim was removed in llvm-3.7.0svn r238244 > "Remove NoFramePointerElim and NoFramePointerElimOverride from > TargetOptions and remove ExecutionEngine's dependence on CodeGen. NFC." > Reviewed-by: Tom Stellard > Signed-off-by: Vinson Lee > --- > src/gallium/auxiliary/gallivm/lp_bld_debug.cpp | 2 ++ > src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 2 ++ > 2 files changed, 4 insertions(+) > > diff --git a/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp > b/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp > index be3e834..76c302f 100644 > --- a/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp > +++ b/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp > @@ -277,8 +277,10 @@ disassemble(const void* func, llvm::raw_ostream & Out) > options.StackAlignmentOverride = 4; > #endif > #if defined(DEBUG) || defined(PROFILE) > +#if HAVE_LLVM < 0x0307 > options.NoFramePointerElim = true; > #endif > +#endif > OwningPtr TM(T->createTargetMachine(Triple, > sys::getHostCPUName(), "", options)); > > /* > diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp > b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp > index 5e8a634..ffed9e6 100644 > --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp > +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp > @@ -439,8 +439,10 @@ > lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT, > #if HAVE_LLVM < 0x0304 > options.NoFramePointerElimNonLeaf = true; > #endif > +#if HAVE_LLVM < 0x0307 > options.NoFramePointerElim = true; > #endif > +#endif > > builder.setEngineKind(EngineKind::JIT) >.setErrorStr(&Error) > -- > 2.4.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: Fix output swizzle in get_mul_for_src
Upon further consideration and actually seeing the patch, I think using num_components would be better. Calling it writemask isn't really true since it isn't the writemask for the mul. I thought about calling it read_mask but that isn't really true either because it isn't the components of the mul that get read either. It's only the write/read mask when combined with the swizzle. I guess we could call it swizzle_mask, but that just seems strange. Using the number of components nicely side-steps the whole problem. Since this pass fundamentally requires SSA, I don't think that's a problem. --Jason On Wed, May 27, 2015 at 1:10 AM, Iago Toral Quiroga wrote: > When we compute the output swizzle we want to consider the writemask of the > add operation, not the one from the multiplication. > --- > src/glsl/nir/nir_opt_peephole_ffma.c | 14 -- > 1 file changed, 8 insertions(+), 6 deletions(-) > > diff --git a/src/glsl/nir/nir_opt_peephole_ffma.c > b/src/glsl/nir/nir_opt_peephole_ffma.c > index b430eac..c895c22 100644 > --- a/src/glsl/nir/nir_opt_peephole_ffma.c > +++ b/src/glsl/nir/nir_opt_peephole_ffma.c > @@ -73,7 +73,8 @@ are_all_uses_fadd(nir_ssa_def *def) > } > > static nir_alu_instr * > -get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], bool *negate, bool > *abs) > +get_mul_for_src(nir_alu_src *src, int writemask, > +uint8_t swizzle[4], bool *negate, bool *abs) > { > assert(src->src.is_ssa && !src->abs && !src->negate); > > @@ -85,16 +86,16 @@ get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], > bool *negate, bool *abs) > switch (alu->op) { > case nir_op_imov: > case nir_op_fmov: > - alu = get_mul_for_src(&alu->src[0], swizzle, negate, abs); > + alu = get_mul_for_src(&alu->src[0], writemask, swizzle, negate, abs); >break; > > case nir_op_fneg: > - alu = get_mul_for_src(&alu->src[0], swizzle, negate, abs); > + alu = get_mul_for_src(&alu->src[0], writemask, swizzle, negate, abs); >*negate = !*negate; >break; > > case nir_op_fabs: > - alu = get_mul_for_src(&alu->src[0], swizzle, negate, abs); > + alu = get_mul_for_src(&alu->src[0], writemask, swizzle, negate, abs); >*negate = false; >*abs = true; >break; > @@ -116,7 +117,7 @@ get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], > bool *negate, bool *abs) >return NULL; > > for (unsigned i = 0; i < 4; i++) { > - if (!(alu->dest.write_mask & (1 << i))) > + if (!(writemask & (1 << i))) > break; > >swizzle[i] = swizzle[src->swizzle[i]]; > @@ -160,7 +161,8 @@ nir_opt_peephole_ffma_block(nir_block *block, void > *void_state) > negate = false; > abs = false; > > - mul = get_mul_for_src(&add->src[add_mul_src], swizzle, &negate, > &abs); > + mul = get_mul_for_src(&add->src[add_mul_src], add->dest.write_mask, > + swizzle, &negate, &abs); > > if (mul != NULL) > break; > -- > 1.9.1 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 13/15 v2] mesa: Use the profile instead of an extension bit to validate GL_TEXTURE_CUBE_MAP
On Thursday 21 May 2015, Ian Romanick wrote: > From: Ian Romanick > > The extension on which this depends will always be enabled in core > profile, and the extension bit is about to be removed. > > Signed-off-by: Ian Romanick > Cc: "10.6" > Cc: Fredrik Höglund > --- > src/mesa/main/fbobject.c | 13 ++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c > index 5b8c52a..c5a7026 100644 > --- a/src/mesa/main/fbobject.c > +++ b/src/mesa/main/fbobject.c > @@ -2707,6 +2707,10 @@ check_texture_target(struct gl_context *ctx, GLenum > target, > /* We're being called by glFramebufferTextureLayer(). > * The only legal texture types for that function are 3D, > * cube-map, and 1D/2D/cube-map array textures. > +* > +* We don't need to check for GL_ARB_texture_cube_map_array because the > +* application wouldn't have been able to create a texture with a > +* GL_TEXTURE_CUBE_MAP_ARRAY target if the extension were not enabled. > */ > switch (target) { > case GL_TEXTURE_3D: > @@ -2716,10 +2720,13 @@ check_texture_target(struct gl_context *ctx, GLenum > target, > case GL_TEXTURE_2D_MULTISAMPLE_ARRAY: >return true; > case GL_TEXTURE_CUBE_MAP: > - /* This target is valid in TextureLayer when ARB_direct_state_access > - * or OpenGL 4.5 is supported. > + /* We don't need to check the extension (GL_ARB_direct_state_access) or > + * GL version (4.5) for GL_TEXTURE_CUBE_MAP because DSA is always > + * enabled in core profile. This can be called from > + * _mesa_FramebufferTextureLayer in compatibility profile (OpenGL 3.0), > + * so we do have to check the profile. > */ > - return ctx->Extensions.ARB_direct_state_access; > + return ctx->API == API_OPENGL_CORE; > } > > _mesa_error(ctx, GL_INVALID_OPERATION, > This patch, and patches 14 and 15 are: Reviewed-by: Fredrik Höglund Patches 1-12 are: Acked-by: Fredrik Höglund Fredrik ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] configure.ac: enable building the EGL DRM platform by default
On Wed, May 27, 2015 at 3:21 AM, Emil Velikov wrote: > Hmm > > ./autogen.sh --with-gallium-drivers=radeonsi --enable-egl --enable-gbm > > seems to work fine here. Although... if I drop the --enable-gbm it > complains "EGL platform drm needs gbm", which is an entirely different > kettle of fish. I think I saw a radeonsi error, but I don't use --enable-egl and --enable-gbm. Ideally, --with-egl-platforms=xxx should enable EGL and --with-egl-platforms= should disable it. > > Fwiw I would opt to just enable gbm on supported platforms (everything > but Win/Mac) and then add drm to the list, based on --enable-gbm. If you > don't comfortable with it, I'll send patches as with follow up commit(s). Yes, please. Thanks, Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] configure.ac: enable building GLES1 and GLES2 by default
On Wed, May 27, 2015 at 4:53 AM, Emil Velikov wrote: > On 27 May 2015 at 11:23, Dave Airlie wrote: >>> Wow, I hadn't expected such a hateful comment on GLES1. >>> >>> Does anyone else want to convince me that GLES1 should burn in hell? >> >> So I dug around, >> >> commit 4c06853833996d990eb76b195ca5d6838c6f3d6b >> Author: Adam Jackson >> Date: Wed May 8 18:03:21 2013 -0400 >> >> Switch to Mesa master (pre 9.2) >> >> - Fix llvmpipe on big-endian and enable llvmpipe everywhere >> - Build vdpau drivers for r600/radeonsi/nouveau >> - Enable hardware floating-point texture support >> - Drop GLESv1, nothing's using it, let's not start >> >> So at least in Fedora 2 years ago, we realised there was no GLES1 >> users in the distro, >> and we didn't want to encourage any. >> >> I suppose some users might exist outside the classic Linux distro >> world. i.e. android, embedded land. >> > At least three other distros have GLES1 - Arch, Debian and OpenSUSE. > So imho one should just leave the decision to depreciate/kill it off > to the distros ? No one is suggesting deleting GLES1. I think distributions are going to inevitably ship things they have no reason to, regardless of our defaults, often because they don't know. The best we can do is guide their hands. If GLES1 is unused, we should note that in configure.ac to better communicate it to the distributions. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v4 PATCH 05/10] mesa: helper function for scissor box of gl_framebuffer
On Wednesday, May 27, 2015 01:17:21 PM Rogovin, Kevin wrote: > Hi, > > > 44 instances of a leading + in mesa/main compared to 78 trailing ones. > > Huh, I was going to say that it's really uncommon to do this in mesa, but > > that may not be supported by fact. > > Considering there is a formatting issue below, I can change it to trailing. I > don’t care. > > >> + if (ymax < bbox[3]) { > >> +bbox[3] = ymax; > >indent looks messed up here. > > Sighs, because it is. I missed that line, and I have missed that one since > v2. I really wish I had a script to check the indenting. Sighs. Out of curiosity, what editor are you using? Usually setting the indentation rules in your editor takes care of most of these problems. If you're using vim, this will give you the correct settings for Mesa: if has("autocmd") au BufNewFile,BufRead */mesa/* set expandtab tabstop=8 softtabstop=3 shiftwidth=3 endif We have a .dir-locals.el file that should provide the correct settings for Emacs. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 6/5] i965/vs: Rework the logic for generating NIR from ARB vertex programs
On Wed, May 20, 2015 at 12:06 PM, Jason Ekstrand wrote: > Whether or not to use NIR is now equivalent to brw->scalar_vs. We can > simplify the logic and make it far less confusing. > --- I'm fine with removing the old visitor code and the old ARB fp backends. The series is Reviewed-by: Matt Turner Ken might want to have a quick look as well. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: fix binding validation for interface blocks
On Wed, May 27, 2015 at 3:53 AM, Timothy Arceri wrote: > Cc: Kenneth Graunke > --- > > It looks like this may never have actually worked, at least there > is no way it could have for inteface block arrays :) > > No piglit regressions. > > src/glsl/ast_to_hir.cpp | 22 ++ > 1 file changed, 14 insertions(+), 8 deletions(-) > > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp > index 0ab664e..861bbfe 100644 > --- a/src/glsl/ast_to_hir.cpp > +++ b/src/glsl/ast_to_hir.cpp > @@ -2041,10 +2041,10 @@ validate_matrix_layout_for_type(struct > _mesa_glsl_parse_state *state, > static bool > validate_binding_qualifier(struct _mesa_glsl_parse_state *state, > YYLTYPE *loc, > - ir_variable *var, > + const glsl_type *type, > const ast_type_qualifier *qual) > { > - if (var->data.mode != ir_var_uniform) { > + if (!qual->flags.q.uniform) { >_mesa_glsl_error(loc, state, > "the \"binding\" qualifier only applies to uniforms"); >return false; > @@ -2056,10 +2056,11 @@ validate_binding_qualifier(struct > _mesa_glsl_parse_state *state, > } > > const struct gl_context *const ctx = state->ctx; > - unsigned elements = var->type->is_array() ? var->type->length : 1; > + unsigned elements = type->is_array() ? type->length : 1; > unsigned max_index = qual->binding + elements - 1; > + const glsl_type *base_type = type->without_array(); > > - if (var->type->is_interface()) { > + if (base_type->is_interface()) { >/* UBOs. From page 60 of the GLSL 4.20 specification: > * "If the binding point for any uniform block instance is less than > zero, > * or greater than or equal to the implementation-dependent maximum > @@ -2077,8 +2078,7 @@ validate_binding_qualifier(struct > _mesa_glsl_parse_state *state, >ctx->Const.MaxUniformBufferBindings); > return false; >} > - } else if (var->type->is_sampler() || > - (var->type->is_array() && > var->type->fields.array->is_sampler())) { > + } else if (base_type->is_sampler()) { >/* Samplers. From page 63 of the GLSL 4.20 specification: > * "If the binding is less than zero, or greater than or equal to the > * implementation-dependent maximum supported number of units, a > @@ -2095,7 +2095,7 @@ validate_binding_qualifier(struct > _mesa_glsl_parse_state *state, > > return false; >} > - } else if (var->type->contains_atomic()) { > + } else if (base_type->contains_atomic()) { >assert(ctx->Const.MaxAtomicBufferBindings <= > MAX_COMBINED_ATOMIC_BUFFERS); >if (unsigned(qual->binding) >= ctx->Const.MaxAtomicBufferBindings) { > _mesa_glsl_error(loc, state, "layout(binding = %d) exceeds the " > @@ -2651,7 +2651,7 @@ apply_type_qualifier_to_variable(const struct > ast_type_qualifier *qual, > } > > if (qual->flags.q.explicit_binding && > - validate_binding_qualifier(state, loc, var, qual)) { > + validate_binding_qualifier(state, loc, var->type, qual)) { >var->data.explicit_binding = true; >var->data.binding = qual->binding; > } > @@ -5658,6 +5658,8 @@ ast_interface_block::hir(exec_list *instructions, > num_variables, > packing, > this->block_name); > + if (this->layout.flags.q.explicit_binding) > + validate_binding_qualifier(state, &loc, block_type, &this->layout); > > if (!state->symbols->add_interface(block_type->name, block_type, > var_mode)) { >YYLTYPE loc = this->get_location(); > @@ -5744,6 +5746,10 @@ ast_interface_block::hir(exec_list *instructions, > const glsl_type *block_array_type = > process_array_type(&loc, block_type, this->array_specifier, > state); > > + if(this->layout.flags.q.explicit_binding) Space after if ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/5] mesa/es3.1: Enable ES 3.1 API and shading language version
On 05/26/2015 10:37 PM, Ian Romanick wrote: From: Ian Romanick This is a bit of a hack for now. Several of the extensions required for OpenGL ES 3.1 have no support, at all, in Mesa. However, with this patch and a patch to allow MESA_GL_VERSION_OVERRIDE to work with ES contexts, people can begin testing the ES "version" of the functionality that is supported. Signed-off-by: Ian Romanick --- src/mesa/main/getstring.c | 16 src/mesa/main/version.c | 18 +- 2 files changed, 29 insertions(+), 5 deletions(-) diff --git a/src/mesa/main/getstring.c b/src/mesa/main/getstring.c index 1b2c7f0..72d99ca 100644 --- a/src/mesa/main/getstring.c +++ b/src/mesa/main/getstring.c @@ -72,10 +72,18 @@ shading_language_version(struct gl_context *ctx) break; case API_OPENGLES2: - return (ctx->Version < 30) - ? (const GLubyte *) "OpenGL ES GLSL ES 1.0.16" - : (const GLubyte *) "OpenGL ES GLSL ES 3.00"; - + switch (ctx->Version) { + case 20: + return (const GLubyte *) "OpenGL ES GLSL ES 1.0.16"; There's revision 17 (1.0.17) out there, should we have it here or rather just 1.0? + case 30: + return (const GLubyte *) "OpenGL ES GLSL ES 3.00"; + case 31: + return (const GLubyte *) "OpenGL ES GLSL ES 3.10"; + default: + _mesa_problem(ctx, + "Invalid OpenGL ES version in shading_language_version()"); + return (const GLubyte *) 0; + } case API_OPENGLES: /* fall-through */ diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c index 699a0de..e817e2d 100644 --- a/src/mesa/main/version.c +++ b/src/mesa/main/version.c @@ -433,7 +433,23 @@ compute_version_es2(const struct gl_extensions *extensions) extensions->EXT_texture_snorm && extensions->NV_primitive_restart && extensions->OES_depth_texture_cube_map); - if (ver_3_0) { + const bool ver_3_1 = (ver_3_0 && + extensions->ARB_arrays_of_arrays && + extensions->ARB_compute_shader && + extensions->ARB_draw_indirect && + false /*extensions->ARB_framebuffer_no_attachments*/ && + extensions->ARB_shader_atomic_counters && + extensions->ARB_shader_image_load_store && + false /*extensions->ARB_shader_image_size*/ && + false /*extensions->ARB_shader_storage_buffer_object*/ && + extensions->ARB_shading_language_packing && + extensions->ARB_stencil_texturing && + extensions->ARB_gpu_shader5 && + extensions->EXT_shader_integer_mix); + + if (ver_3_1) { + return 31; + } else if (ver_3_0) { return 30; } else if (ver_2_0) { return 20; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 17/22] i965/gen9: Add XY_FAST_COPY_BLT support to intelEmitCopyBlit()
On Thu, May 21, 2015 at 05:46:48PM -0700, Anuj Phogat wrote: > On Thu, May 21, 2015 at 12:26 PM, Ben Widawsky wrote: SNIP > > As I mentioned above wrt the CL alignment, I think it's safe to do for both > > modes. Something like: > > > > /* Check both are at least cacheline aligned */ > > if (brw->gen >= 8 && (src_offset | dst_offset) & 63) > > return false; > > > We do this check for SRC_COPY_BLT in alignemnet_valid(). I added > the restrictions for FAST_COPY_BLT in fast_copy_blit_error_check(). > Doing this check for both satisy all alignment restrictions of both modes > except one in fast copy blit: when src is untiled and dst is tiled, dst > should be 64 byte aligned and src should be 16 bytes aligned. See > fast_copy_blit_error_check(). > But we're adding a restriction for it to be 64 byte by the suggested > change. Is there any advantage of the suggested check for both? > Oh hmm, I hadn't noticed a 16byte alignment requirement anywhere. I thought 64b was the minimum - if it's not ignore me. > >> if ((dst_y_tiled || src_y_tiled) && brw->gen < 6) > >>return false; > >> > >> assert(!dst_y_tiled || (dst_pitch % 128) == 0); > >> assert(!src_y_tiled || (src_pitch % 128) == 0); > >> > >> + if (use_fast_copy_blit == false && > >> + (src_tr_mode != INTEL_MIPTREE_TRMODE_NONE || > >> +dst_tr_mode != INTEL_MIPTREE_TRMODE_NONE)) > >> + return false; > >> + > > > > It seems like we should make it an error to use this improperly: > > > > /* It's an error for a caller to try to do a regular blit with a yf/ys > > region */ > > assert(use_fast_copy_blit && > >(src_tr_mode != INTEL_MIPTREE_TRMODE_NONE) && > >(dst_tr_mode != INTEL_MIPTREE_TRMODE_NONE)) > > > Agree. > >> /* do space check before going any further */ > >> do { > >> aper_array[0] = brw->batch.bo; > >> @@ -334,13 +498,6 @@ intelEmitCopyBlit(struct brw_context *brw, > >> src_buffer, src_pitch, src_offset, src_x, src_y, > >> dst_buffer, dst_pitch, dst_offset, dst_x, dst_y, w, h); > >> > >> - /* Blit pitch must be dword-aligned. Otherwise, the hardware appears > >> to drop > >> -* the low bits. Offsets must be naturally aligned. > >> -*/ > >> - if (src_pitch % 4 != 0 || src_offset % cpp != 0 || > >> - dst_pitch % 4 != 0 || dst_offset % cpp != 0) > >> - return false; > >> - > >> /* For big formats (such as floating point), do the copy using 16 or > >> 32bpp > >> * and multiply the coordinates. > >> */ > >> @@ -359,27 +516,76 @@ intelEmitCopyBlit(struct brw_context *brw, > >>} > >> } > >> > >> - BR13 = br13_for_cpp(cpp) | translate_raster_op(logic_op) << 16; > >> + if (use_fast_copy_blit && > >> + fast_copy_blit_error_check(src_offset, src_pitch, src_tiling, > >> + dst_offset, dst_pitch, dst_tiling, > >> cpp)) { > >> + BR13 = br13_for_cpp(cpp); > >> > > > > If you follow about you can remove the error check here as we know it's > > safe to > > proceed. > > > Right. As i understand you mean to move the fast_copy_blit_error_check() > inside > can_fast_copy_blit() helper ? > > >> - switch (cpp) { > >> - case 1: > >> - case 2: > >> - CMD = XY_SRC_COPY_BLT_CMD; > >> - break; > >> - case 4: > >> - CMD = XY_SRC_COPY_BLT_CMD | XY_BLT_WRITE_ALPHA | XY_BLT_WRITE_RGB; > >> - break; > >> - default: > >> - return false; > >> - } > >> + if (src_tr_mode == INTEL_MIPTREE_TRMODE_YF) > >> + BR13 |= XY_SRC_TRMODE_YF; > >> > >> - if (dst_tiling != I915_TILING_NONE) { > >> - CMD |= XY_DST_TILED; > >> - dst_pitch /= 4; > >> - } > >> - if (src_tiling != I915_TILING_NONE) { > >> - CMD |= XY_SRC_TILED; > >> - src_pitch /= 4; > >> + if (dst_tr_mode == INTEL_MIPTREE_TRMODE_YF) > >> + BR13 |= XY_DST_TRMODE_YF; > >> + > >> + CMD = XY_FAST_COPY_BLT_CMD; > >> + > >> + if (dst_tiling != I915_TILING_NONE) { > >> + SET_TILING_XY_FAST_COPY_BLT(dst_tiling, dst_tr_mode, XY_DST); > >> + /* Pitch value should be specified as a number of Dwords. */ > >> + dst_pitch /= 4; > >> + } > >> + if (src_tiling != I915_TILING_NONE) { > >> + SET_TILING_XY_FAST_COPY_BLT(src_tiling, src_tr_mode, XY_SRC); > >> + /* Pitch value should be specified as a number of Dwords. */ > >> + src_pitch /= 4; > >> + } > >> + > >> + CMD |= get_tr_horizontal_align(src_tr_mode, cpp, true /* is_src */); > >> + CMD |= get_tr_vertical_align(src_tr_mode, cpp, true /* is_src */); > >> + > >> + CMD |= get_tr_horizontal_align(dst_tr_mode, cpp, false /* is_src > >> */); > >> + CMD |= get_tr_vertical_align(dst_tr_mode, cpp, false /* is_src */); > >> + > >> + } else { > >> + /* Source and destination base addresses should be 4 KB aligned. */ > >> + if (dst_tiling != I915_TILING_NONE) { > >> + if (dst_offset & 4095) > >> +return fal
Re: [Mesa-dev] [PATCH 0/5] Remainder of GLES3.1 foundation work
Everything looks good, this will enable us to work on bunch of small corners of gles 3.1 even though some of the extensions are still being baked. I sent small comment to patch #2 which you can feel free to ignore, I've been reading ES spec a lot lately so happened to notice that I'm reading more recent one. Series is Reviewed-by: Tapani Pälli On 05/26/2015 10:37 PM, Ian Romanick wrote: The first two patches are just re-re-sends of the same patches. Based on feedback from Emil and others, I changed the later patches to use a new environment variable, MESA_GLES_VERSION_OVERRIDE, to override GLES versions. I considered naming this MESA_GLES2_VERSION_OVERRIDE so that we could later add support for ES1 overrides. I decided against this because 1. I don't think anyone cares two hoots about ES1. 2. Everyone will typo MESA_GLES_VERSION_OVERRIDE instead of MESA_GLES2_VERSION_OVERRIDE anyway. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v4 PATCH 05/10] mesa: helper function for scissor box of gl_framebuffer
On 27 May 2015 at 19:05, Kenneth Graunke wrote: > If you're using vim, this will give you the correct settings for Mesa: > > if has("autocmd") > au BufNewFile,BufRead */mesa/* set expandtab tabstop=8 softtabstop=3 > shiftwidth=3 > endif > setlocal is more appropriate than set for this kind of thing (because it only applies to the current buffer), although I suppose it doesn't matter that much in practice. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965/fs: Fix lowering of integer multiplication with cmod.
If the multiplication's result is unused, except by a conditional_mod, the destination will be null. Since the final instruction in the lowered sequence is a partial-write, we can't put the conditional mod on it and we have to store the full result to a register and do a MOV with a conditional mod. Cc: "10.6" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90580 --- src/mesa/drivers/dri/i965/brw_fs.cpp | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 42a0d78..5cc4fe6 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3606,6 +3606,10 @@ fs_visitor::lower_integer_multiplication() * schedule multi-component multiplications much better. */ + if (inst->conditional_mod && inst->dst.is_null()) { +inst->dst = fs_reg(GRF, alloc.allocate(dispatch_width / 8), + inst->dst.type, dispatch_width); + } fs_reg low = inst->dst; fs_reg high(GRF, alloc.allocate(dispatch_width / 8), inst->dst.type, dispatch_width); @@ -3655,6 +3659,13 @@ fs_visitor::lower_integer_multiplication() low.stride = 2; insert(ADD(dst, low, high)); + + if (inst->conditional_mod) { +fs_reg null(retype(brw_null_reg(), inst->dst.type)); +fs_inst *mov = MOV(null, inst->dst); +mov->conditional_mod = inst->conditional_mod; +insert(mov); + } } #undef insert -- 2.3.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/15] egl: combine VersionMajor and VersionMinor into one variable
On Wed 13 May 2015, Marek Olšák wrote: > From: Marek Olšák > > --- > src/egl/main/eglapi.c | 9 - > src/egl/main/egldisplay.h | 3 +-- > 2 files changed, 5 insertions(+), 7 deletions(-) Patches 1-3 are Reviewed-by: Chad Versace I'm looking at the other patches right now. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v4 PATCH 05/10] mesa: helper function for scissor box of gl_framebuffer
On 05/27/2015 04:21 AM, Ilia Mirkin wrote: > More trivia: > > On Wed, May 27, 2015 at 5:49 AM, Kevin Rogovin > wrote: >> Add helper convenience function that intersects the scissor values >> against a passed bounding box. In addition, to avoid replicated code, >> make the function _mesa_scissor_bounding_box() use this new function. >> >> v2: >> Split from patch "mesa:helper-conveniance functions for drivers to >> implement ARB_framebuffer_no_attachment". >> White space and long line fixes. >> >> v3: >> No changes. >> >> v4: >> No changes. >> >> >> Signed-off-by: Kevin Rogovin >> --- >> src/mesa/main/framebuffer.c | 63 >> +++-- >> src/mesa/main/framebuffer.h | 5 >> 2 files changed, 48 insertions(+), 20 deletions(-) >> >> diff --git a/src/mesa/main/framebuffer.c b/src/mesa/main/framebuffer.c >> index c2cfb92..dd9e4bc 100644 >> --- a/src/mesa/main/framebuffer.c >> +++ b/src/mesa/main/framebuffer.c >> @@ -357,42 +357,38 @@ update_framebuffer_size(struct gl_context *ctx, struct >> gl_framebuffer *fb) >> } >> >> >> + >> /** >> - * Calculate the inclusive bounding box for the scissor of a specific >> viewport >> + * Given a bounding box, intersect the bounding box with the scissor of >> + * a specified vieport. >> * >> * \param ctx GL context. >> - * \param buffer Framebuffer to be checked against >> * \param idx Index of the desired viewport >> * \param bboxBounding box for the scissored viewport. Stored as xmin, >> *xmax, ymin, ymax. >> - * >> - * \warning This function assumes that the framebuffer dimensions are up to >> - * date (e.g., update_framebuffer_size has been recently called on \c >> buffer). >> - * >> - * \sa _mesa_clip_to_region >> */ >> -void >> -_mesa_scissor_bounding_box(const struct gl_context *ctx, >> - const struct gl_framebuffer *buffer, >> - unsigned idx, int *bbox) >> +extern void >> +_mesa_intersect_scissor_bounding_box(const struct gl_context *ctx, >> + unsigned idx, int *bbox) >> { >> - bbox[0] = 0; >> - bbox[2] = 0; >> - bbox[1] = buffer->Width; >> - bbox[3] = buffer->Height; >> - >> if (ctx->Scissor.EnableFlags & (1u << idx)) { >> + int xmax, ymax; >> + >> + xmax = ctx->Scissor.ScissorArray[idx].X >> ++ ctx->Scissor.ScissorArray[idx].Width; > > 44 instances of a leading + in mesa/main compared to 78 trailing ones. > Huh, I was going to say that it's really uncommon to do this in mesa, > but that may not be supported by fact. I wrote a lot of the leading-operator cases. I think most or all of those are cases where the operator is in an if-condition. Those cases improve readability because of how the indentation lines up if (x + y) z; vs. if (x + y) z; It's more obvious that the second line is still part of the condition when you're skimming. Matt has made the argument that (and I'm paraphrasing) like Yoda conditions, it may technically be better, but it's weird... and weird is usually bad. *shrug* ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/5] mesa/es3.1: Enable ES 3.1 API and shading language version
On 05/27/2015 10:36 AM, Tapani wrote: > On 05/26/2015 10:37 PM, Ian Romanick wrote: >> From: Ian Romanick >> >> This is a bit of a hack for now. Several of the extensions required for >> OpenGL ES 3.1 have no support, at all, in Mesa. However, with this >> patch and a patch to allow MESA_GL_VERSION_OVERRIDE to work with ES >> contexts, people can begin testing the ES "version" of the functionality >> that is supported. >> >> Signed-off-by: Ian Romanick >> --- >> src/mesa/main/getstring.c | 16 >> src/mesa/main/version.c | 18 +- >> 2 files changed, 29 insertions(+), 5 deletions(-) >> >> diff --git a/src/mesa/main/getstring.c b/src/mesa/main/getstring.c >> index 1b2c7f0..72d99ca 100644 >> --- a/src/mesa/main/getstring.c >> +++ b/src/mesa/main/getstring.c >> @@ -72,10 +72,18 @@ shading_language_version(struct gl_context *ctx) >> break; >>case API_OPENGLES2: >> - return (ctx->Version < 30) >> - ? (const GLubyte *) "OpenGL ES GLSL ES 1.0.16" >> - : (const GLubyte *) "OpenGL ES GLSL ES 3.00"; >> - >> + switch (ctx->Version) { >> + case 20: >> + return (const GLubyte *) "OpenGL ES GLSL ES 1.0.16"; > > There's revision 17 (1.0.17) out there, should we have it here or rather > just 1.0? We could bump it or leave it. I don't think it matters much either way. Section 6.1.5 (String Queries) of the OpenGL ES 2.0.25 spec says: "The SHADING_LANGUAGE_VERSION string is laid out as follows: "OpenGL ES GLSL ES N.M vendor-specific information" The version number is either of the form major_number.minor_number or major_number.minor_number.release_number, where the numbers all have one or more digits. The release number and vendor specific information are optional. However, if present, then they pertain to the server and their format and contents are implementation- dependent." >> + case 30: >> + return (const GLubyte *) "OpenGL ES GLSL ES 3.00"; >> + case 31: >> + return (const GLubyte *) "OpenGL ES GLSL ES 3.10"; >> + default: >> + _mesa_problem(ctx, >> + "Invalid OpenGL ES version in >> shading_language_version()"); >> + return (const GLubyte *) 0; >> + } >> case API_OPENGLES: >> /* fall-through */ >> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c >> index 699a0de..e817e2d 100644 >> --- a/src/mesa/main/version.c >> +++ b/src/mesa/main/version.c >> @@ -433,7 +433,23 @@ compute_version_es2(const struct gl_extensions >> *extensions) >>extensions->EXT_texture_snorm && >>extensions->NV_primitive_restart && >>extensions->OES_depth_texture_cube_map); >> - if (ver_3_0) { >> + const bool ver_3_1 = (ver_3_0 && >> + extensions->ARB_arrays_of_arrays && >> + extensions->ARB_compute_shader && >> + extensions->ARB_draw_indirect && >> + false >> /*extensions->ARB_framebuffer_no_attachments*/ && >> + extensions->ARB_shader_atomic_counters && >> + extensions->ARB_shader_image_load_store && >> + false /*extensions->ARB_shader_image_size*/ && >> + false >> /*extensions->ARB_shader_storage_buffer_object*/ && >> + extensions->ARB_shading_language_packing && >> + extensions->ARB_stencil_texturing && >> + extensions->ARB_gpu_shader5 && >> + extensions->EXT_shader_integer_mix); >> + >> + if (ver_3_1) { >> + return 31; >> + } else if (ver_3_0) { >> return 30; >> } else if (ver_2_0) { >> return 20; > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/15] egl: import headers from Khronos EGL registry
On Thu 14 May 2015, Emil Velikov wrote: > Hi Marek, > On 12/05/15 22:54, Marek Olšák wrote: > > From: Marek Olšák > > > > with the extension of keeping: > > #define KHRONOS_APICALL __attribute__((visibility("default"))) > > > > And don't include mesa headers in egl.h. > Can we do this more gradually (like below). It will ease the conflicts > that this patch might cause. > > - egl.h - should have no side effects > - eglext.h and eglmesaext.h - will clearly illustrate the parts that > have been moved from the latter to the former. > - eglplatform.h and khrplatform.h - there are a few controversial > changes which might cause breakage. I want to see this patch broken up too. Currently, the patch contains a straightforward header update layered on top of Marek's header modifications. The two should be split into separate patches so it's clear which changes are which. > > --- > > include/EGL/egl.h | 562 > > + > > include/EGL/eglext.h | 259 +++-- > > include/EGL/eglplatform.h | 45 +--- > > include/KHR/khrplatform.h | 25 +- > > src/egl/main/egltypedefs.h | 2 + > > 5 files changed, 540 insertions(+), 353 deletions(-) > > > > > diff --git a/include/EGL/eglplatform.h b/include/EGL/eglplatform.h > > index 2eb6865..1284089 100644 > > --- a/include/EGL/eglplatform.h > > +++ b/include/EGL/eglplatform.h > > > -#elif defined(__WINSCW__) || defined(__SYMBIAN32__) /* Symbian */ > > +#elif defined(__APPLE__) || defined(__WINSCW__) || defined(__SYMBIAN32__) > > /* Symbian */ > > > > typedef int EGLNativeDisplayType; > > typedef void *EGLNativeWindowType; > > typedef void *EGLNativePixmapType; > > > > -#elif defined(WL_EGL_PLATFORM) > > +#elif defined(__ANDROID__) || defined(ANDROID) > > > > -typedef struct wl_display *EGLNativeDisplayType; > > -typedef struct wl_egl_pixmap *EGLNativePixmapType; > > -typedef struct wl_egl_window *EGLNativeWindowType; > > +#include > > > > -#elif defined(__GBM__) > > - > > -typedef struct gbm_device *EGLNativeDisplayType; > > -typedef struct gbm_bo *EGLNativePixmapType; > > -typedef void *EGLNativeWindowType; > > - > > -#elif defined(ANDROID) /* Android */ > > - > > -struct ANativeWindow; > > struct egl_native_pixmap_t; > > > > -typedef struct ANativeWindow*EGLNativeWindowType; > > -typedef struct egl_native_pixmap_t *EGLNativePixmapType; > > -typedef void*EGLNativeDisplayType; > > +typedef struct ANativeWindow* EGLNativeWindowType; > > +typedef struct egl_native_pixmap_t* EGLNativePixmapType; > > +typedef void* EGLNativeDisplayType; > > > > #elif defined(__unix__) > > > > -#if defined(MESA_EGL_NO_X11_HEADERS) > > - > > -typedef void*EGLNativeDisplayType; > > -typedef khronos_uintptr_t EGLNativePixmapType; > > -typedef khronos_uintptr_t EGLNativeWindowType; > > - > > -#else > > - > > /* X11 (tentative) */ > > #include > > #include > > @@ -122,18 +103,8 @@ typedef Display *EGLNativeDisplayType; > > typedef Pixmap EGLNativePixmapType; > > typedef Window EGLNativeWindowType; > > > > -#endif /* MESA_EGL_NO_X11_HEADERS */ > > - > > -#elif __HAIKU__ > > -#include > > -typedef void *EGLNativeDisplayType; > > -typedef khronos_uintptr_t EGLNativePixmapType; > > -typedef khronos_uintptr_t EGLNativeWindowType; > > - > Upon closer look, one could get away with the above changes, although > there may be something more subtle to it. > > Kristian, Chad, > > Would you have any suggestions for/against nuking the Wayland/GBM/other > typedefs ? With an extra eye on the Haiku changes, what would it take to > get the current eglplatform.h (or equivalent) accepted with Khronos ? I'm against. Khronos intended eglplatform.h specifically for this purpose. To quote the Khronos API Implementer's Guide: Implementers may need to modify eglplatform.h. In particular the eglNativeDisplayType, eglNativeWindowType, and eglNativePixmapType typedefs must be defined as appropriate for the platform (typically, they will be typedef'ed to corresponding types in the native window system). Developer documentation should mention the correspondence so that developers know what parameters to pass to eglCreateWindowSurface, eglCreatePixmapSurface, and eglCopyBuffers. Documentation should also describe the format of the display_id parameter to eglGetDisplay, since this is a platform-specific identifier. [https://www.khronos.org/registry/implementers_guide.html] And the EGL 1.5 spec says this about eglplatform.h: All platform-specific types, values, and macros used in egl.h are partitioned into a platform header, eglplatform.h, which is automatically included by egl.h. [...] Implementers should need to modify only eglplatform.h, never egl.h 2 . Therefore eglplatform.h should keep the typedefs f
Re: [Mesa-dev] [PATCH 2/5] i965: Make fs/vec4_visitor inherit from ir_visitor directly
On Wednesday, May 20, 2015 11:09:03 AM Jason Ekstrand wrote: > This is using multiple inheritance in C++. However, ir_visitor is really > just an interface with no data so it shouldn't be so bad. You might also mention that this is a temporary measure that's going away in a few commits. Series is: Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/7] tgsi: add support for geometry shader streams.
Hi, Am 27/05/2015 um 09:45 schrieb Dave Airlie: > This adds support to retrieve the primitive counts > for each stream, along with the offset for each > primitive into the output array. > > It also adds support for parsing the stream argument > to the emit and end instructions. > > Signed-off-by: Dave Airlie > --- > src/gallium/auxiliary/tgsi/tgsi_exec.c | 59 > ++ > src/gallium/auxiliary/tgsi/tgsi_exec.h | 14 ++-- > 2 files changed, 58 insertions(+), 15 deletions(-) > > diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c > b/src/gallium/auxiliary/tgsi/tgsi_exec.c > index a098a82..f080385 100644 > --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c > +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c > @@ -706,7 +706,22 @@ enum tgsi_exec_datatype { > #define TEMP_OUTPUT_C TGSI_EXEC_TEMP_OUTPUT_C > #define TEMP_PRIMITIVE_I TGSI_EXEC_TEMP_PRIMITIVE_I > #define TEMP_PRIMITIVE_C TGSI_EXEC_TEMP_PRIMITIVE_C > - > +#define TEMP_PRIMITIVE_S1_I TGSI_EXEC_TEMP_PRIMITIVE_S1_I > +#define TEMP_PRIMITIVE_S1_C TGSI_EXEC_TEMP_PRIMITIVE_S1_C > +#define TEMP_PRIMITIVE_S2_I TGSI_EXEC_TEMP_PRIMITIVE_S2_I > +#define TEMP_PRIMITIVE_S2_C TGSI_EXEC_TEMP_PRIMITIVE_S2_C > +#define TEMP_PRIMITIVE_S3_I TGSI_EXEC_TEMP_PRIMITIVE_S3_I > +#define TEMP_PRIMITIVE_S3_C TGSI_EXEC_TEMP_PRIMITIVE_S3_C > + > +static const struct { > + int idx; > + int chan; > +} temp_prim_idxs[] = { > + { TEMP_PRIMITIVE_I, TEMP_PRIMITIVE_C }, > + { TEMP_PRIMITIVE_S1_I, TEMP_PRIMITIVE_S1_C }, > + { TEMP_PRIMITIVE_S2_I, TEMP_PRIMITIVE_S2_C }, > + { TEMP_PRIMITIVE_S3_I, TEMP_PRIMITIVE_S3_C }, > +}; Is that last comma intentional? Regards Michael > > /** The execution mask depends on the conditional mask and the loop mask */ > #define UPDATE_EXEC_MASK(MACH) \ > @@ -1848,35 +1863,51 @@ exec_kill(struct tgsi_exec_machine *mach, > } > > static void > -emit_vertex(struct tgsi_exec_machine *mach) > +emit_vertex(struct tgsi_exec_machine *mach, > +const struct tgsi_full_instruction *inst) > { > + union tgsi_exec_channel r[1]; > + unsigned stream_id; > + unsigned *prim_count; > /* FIXME: check for exec mask correctly > unsigned i; > for (i = 0; i < TGSI_QUAD_SIZE; ++i) { > if ((mach->ExecMask & (1 << i))) > */ > + IFETCH(&r[0], 0, TGSI_CHAN_X); > + stream_id = r[0].u[0]; > + prim_count = > &mach->Temps[temp_prim_idxs[stream_id].idx].xyzw[temp_prim_idxs[stream_id].chan].u[0]; > if (mach->ExecMask) { > - if > (mach->Primitives[mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C].u[0]] > >= mach->MaxOutputVertices) > + if (mach->Primitives[stream_id][*prim_count] >= > mach->MaxOutputVertices) > return; > > + mach->PrimitiveOffsets[stream_id][*prim_count] = > mach->Temps[TEMP_OUTPUT_I].xyzw[TEMP_OUTPUT_C].u[0]; >mach->Temps[TEMP_OUTPUT_I].xyzw[TEMP_OUTPUT_C].u[0] += > mach->NumOutputs; > - > mach->Primitives[mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C].u[0]]++; > + mach->Primitives[stream_id][*prim_count]++; > } > } > > static void > -emit_primitive(struct tgsi_exec_machine *mach) > +emit_primitive(struct tgsi_exec_machine *mach, > + const struct tgsi_full_instruction *inst) > { > - unsigned *prim_count = > &mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C].u[0]; > + unsigned *prim_count; > + union tgsi_exec_channel r[1]; > + unsigned stream_id = 0; > /* FIXME: check for exec mask correctly > unsigned i; > for (i = 0; i < TGSI_QUAD_SIZE; ++i) { > if ((mach->ExecMask & (1 << i))) > */ > + if (inst) { > + IFETCH(&r[0], 0, TGSI_CHAN_X); > + stream_id = r[0].u[0]; > + } > + prim_count = > &mach->Temps[temp_prim_idxs[stream_id].idx].xyzw[temp_prim_idxs[stream_id].chan].u[0]; > if (mach->ExecMask) { >++(*prim_count); >debug_assert((*prim_count * mach->NumOutputs) < > mach->MaxGeometryShaderOutputs); > - mach->Primitives[*prim_count] = 0; > + mach->Primitives[stream_id][*prim_count] = 0; > } > } > > @@ -1885,9 +1916,9 @@ conditional_emit_primitive(struct tgsi_exec_machine > *mach) > { > if (TGSI_PROCESSOR_GEOMETRY == mach->Processor) { >int emitted_verts = > - > mach->Primitives[mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C].u[0]]; > + > mach->Primitives[0][mach->Temps[temp_prim_idxs[0].idx].xyzw[temp_prim_idxs[0].chan].u[0]]; >if (emitted_verts) { > - emit_primitive(mach); > + emit_primitive(mach, NULL); >} > } > } > @@ -4596,11 +4627,11 @@ exec_instruction( >break; > > case TGSI_OPCODE_EMIT: > - emit_vertex(mach); > + emit_vertex(mach, inst); >break; > > case TGSI_OPCODE_ENDPRIM: > - emit_primitive(mach); > + emit_primitive(mach, inst); >break; > > case TGSI_OPCODE_BGNLOOP: > @@ -5061,8 +5092,10 @@ tgsi_exec_machine_run( struct tgsi_exec_machi
Re: [Mesa-dev] [PATCH 3/7] tgsi: add support for geometry shader streams.
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi, Am 27.05.2015 um 09:45 schrieb Dave Airlie: > This adds support to retrieve the primitive counts for each stream, > along with the offset for each primitive into the output array. > > It also adds support for parsing the stream argument to the emit > and end instructions. > > Signed-off-by: Dave Airlie --- > src/gallium/auxiliary/tgsi/tgsi_exec.c | 59 > ++ > src/gallium/auxiliary/tgsi/tgsi_exec.h | 14 ++-- 2 files > changed, 58 insertions(+), 15 deletions(-) > > diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c > b/src/gallium/auxiliary/tgsi/tgsi_exec.c index a098a82..f080385 > 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ > b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -706,7 +706,22 @@ enum > tgsi_exec_datatype { #define TEMP_OUTPUT_C > TGSI_EXEC_TEMP_OUTPUT_C #define TEMP_PRIMITIVE_I > TGSI_EXEC_TEMP_PRIMITIVE_I #define TEMP_PRIMITIVE_C > TGSI_EXEC_TEMP_PRIMITIVE_C - +#define TEMP_PRIMITIVE_S1_I > TGSI_EXEC_TEMP_PRIMITIVE_S1_I +#define TEMP_PRIMITIVE_S1_C > TGSI_EXEC_TEMP_PRIMITIVE_S1_C +#define TEMP_PRIMITIVE_S2_I > TGSI_EXEC_TEMP_PRIMITIVE_S2_I +#define TEMP_PRIMITIVE_S2_C > TGSI_EXEC_TEMP_PRIMITIVE_S2_C +#define TEMP_PRIMITIVE_S3_I > TGSI_EXEC_TEMP_PRIMITIVE_S3_I +#define TEMP_PRIMITIVE_S3_C > TGSI_EXEC_TEMP_PRIMITIVE_S3_C + +static const struct { + int > idx; + int chan; +} temp_prim_idxs[] = { + { TEMP_PRIMITIVE_I, > TEMP_PRIMITIVE_C }, + { TEMP_PRIMITIVE_S1_I, TEMP_PRIMITIVE_S1_C > }, + { TEMP_PRIMITIVE_S2_I, TEMP_PRIMITIVE_S2_C }, + { > TEMP_PRIMITIVE_S3_I, TEMP_PRIMITIVE_S3_C }, +}; Is the last comma intentional? Regards Michael > > /** The execution mask depends on the conditional mask and the loop > mask */ #define UPDATE_EXEC_MASK(MACH) \ @@ -1848,35 +1863,51 @@ > exec_kill(struct tgsi_exec_machine *mach, } > > static void -emit_vertex(struct tgsi_exec_machine *mach) > +emit_vertex(struct tgsi_exec_machine *mach, +const > struct tgsi_full_instruction *inst) { + union tgsi_exec_channel > r[1]; + unsigned stream_id; + unsigned *prim_count; /* FIXME: > check for exec mask correctly unsigned i; for (i = 0; i < > TGSI_QUAD_SIZE; ++i) { if ((mach->ExecMask & (1 << i))) */ + > IFETCH(&r[0], 0, TGSI_CHAN_X); + stream_id = r[0].u[0]; + > prim_count = > &mach->Temps[temp_prim_idxs[stream_id].idx].xyzw[temp_prim_idxs[stream _id].chan].u[0]; > > if (mach->ExecMask) { > - if > (mach->Primitives[mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C] .u[0]] > >= mach->MaxOutputVertices) + if > (mach->Primitives[stream_id][*prim_count] >= > mach->MaxOutputVertices) return; > > + mach->PrimitiveOffsets[stream_id][*prim_count] = > mach->Temps[TEMP_OUTPUT_I].xyzw[TEMP_OUTPUT_C].u[0]; > mach->Temps[TEMP_OUTPUT_I].xyzw[TEMP_OUTPUT_C].u[0] += > mach->NumOutputs; - > mach->Primitives[mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C]. u[0]]++; > > + mach->Primitives[stream_id][*prim_count]++; > } } > > static void -emit_primitive(struct tgsi_exec_machine *mach) > +emit_primitive(struct tgsi_exec_machine *mach, + > const struct tgsi_full_instruction *inst) { - unsigned > *prim_count = > &mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C].u[0]; + > unsigned *prim_count; + union tgsi_exec_channel r[1]; + > unsigned stream_id = 0; /* FIXME: check for exec mask correctly > unsigned i; for (i = 0; i < TGSI_QUAD_SIZE; ++i) { if > ((mach->ExecMask & (1 << i))) */ + if (inst) { + > IFETCH(&r[0], 0, TGSI_CHAN_X); + stream_id = r[0].u[0]; + } > + prim_count = > &mach->Temps[temp_prim_idxs[stream_id].idx].xyzw[temp_prim_idxs[stream _id].chan].u[0]; > > if (mach->ExecMask) { > ++(*prim_count); debug_assert((*prim_count * mach->NumOutputs) < > mach->MaxGeometryShaderOutputs); - > mach->Primitives[*prim_count] = 0; + > mach->Primitives[stream_id][*prim_count] = 0; } } > > @@ -1885,9 +1916,9 @@ conditional_emit_primitive(struct > tgsi_exec_machine *mach) { if (TGSI_PROCESSOR_GEOMETRY == > mach->Processor) { int emitted_verts = - > mach->Primitives[mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C]. u[0]]; > > + mach->Primitives[0][mach->Temps[temp_prim_idxs[0].idx].xyzw[temp_prim_id xs[0].chan].u[0]]; > if (emitted_verts) { - emit_primitive(mach); + > emit_primitive(mach, NULL); } } } @@ -4596,11 +4627,11 @@ > exec_instruction( break; > > case TGSI_OPCODE_EMIT: - emit_vertex(mach); + > emit_vertex(mach, inst); break; > > case TGSI_OPCODE_ENDPRIM: - emit_primitive(mach); + > emit_primitive(mach, inst); break; > > case TGSI_OPCODE_BGNLOOP: @@ -5061,8 +5092,10 @@ > tgsi_exec_machine_run( struct tgsi_exec_machine *mach ) > mach->Temps[TEMP_OUTPUT_I].xyzw[TEMP_OUTPUT_C].u[0] = 0; > > if( mach->Processor == TGSI_PROCESSOR_GEOMETRY ) { - > mach->Temps[TEMP_PRIMITIVE_I].xyzw[TEMP_PRIMITIVE_C].u[0] = 0; - > mach->Primitives[0] = 0; + for (i = 0; i < > TGSI_MAX_VERTEX_STREAMS; i++) { + > mach->Temps[temp_prim_idxs[i].idx].xy
Re: [Mesa-dev] [PATCH 09/15] egl: add EGL 1.5 functions that don't need any changes from extensions
On Wed 13 May 2015, Marek Olšák wrote: > From: Marek Olšák > > Declare the functions without the suffix, so that the core names are exported. > --- > src/egl/main/eglapi.c | 20 > 1 file changed, 12 insertions(+), 8 deletions(-) Patches 5-9 are Reviewed-by: Chad Versace I'm still reading through the remaining patches. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 10/15] egl: add eglWaitSync
On Fri 15 May 2015, Emil Velikov wrote: > On 15/05/15 16:18, Marek Olšák wrote: > > On Fri, May 15, 2015 at 1:23 AM, Emil Velikov > > wrote: > >> On 12/05/15 22:54, Marek Olšák wrote: > >>> From: Marek Olšák > >>> > >>> --- > >>> src/egl/main/eglapi.c | 12 > >>> 1 file changed, 12 insertions(+) > >>> > >>> diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c > >>> index 60df297..544f7e4 100644 > >>> --- a/src/egl/main/eglapi.c > >>> +++ b/src/egl/main/eglapi.c > >>> @@ -1162,6 +1162,7 @@ eglGetProcAddress(const char *procname) > >>>{ "eglDestroySync", (_EGLProc) eglDestroySync }, > >>>{ "eglClientWaitSync", (_EGLProc) eglClientWaitSync }, > >>>{ "eglDestroyImage", (_EGLProc) eglDestroyImage }, > >>> + { "eglWaitSync", (_EGLProc) eglWaitSync }, > >>> #endif /* _EGL_GET_CORE_ADDRESSES */ > >>> #ifdef EGL_MESA_drm_display > >>>{ "eglGetDRMDisplayMESA", (_EGLProc) eglGetDRMDisplayMESA }, > >>> @@ -1514,6 +1515,17 @@ eglWaitSyncKHR(EGLDisplay dpy, EGLSync sync, > >>> EGLint flags) > >>> > >>> > >>> EGLBoolean EGLAPIENTRY > >>> +eglWaitSync(EGLDisplay dpy, EGLSync sync, EGLint flags) > >>> +{ > >>> + /* The KHR version returns EGLint, while the core version returns > >>> +* EGLBoolean. In both cases, the return values can only be EGL_FALSE > >>> and > >>> +* EGL_TRUE. > >>> +*/ > >>> + return eglWaitSyncKHR(dpy, sync, flags); > >> Maybe add the cast, considering that sizeof(EGLenum) != sizeof(EGLInt) > >> on 64 bit systems. > >> > >> Thinking about the assembly - not sure if won't end up with pushing a > >> 64bit value, while a 32bit one being popped. My x86 assembly days were > >> mostly before x86_64 became a thing so ^^ might sound a bit stupid :) > > > > It's EGLBoolean, not EGLenum. Also, no cast is needed between basic types. > > > > typedef int32_t khronos_int32_t; > > typedef khronos_int32_t EGLint; > > typedef unsigned int EGLBoolean; > > > > The types are equivalent on 32-bit and 64-bit, so I could have just > > made eglWaitSyncKHR be an alias of eglWaitSync and it would work, but > > I wanted to document this in the code. > > > Had a bit of a brain fart. Pardon for the noise. Since eglWaitSyncKHR returns only EGL_TRUE or EGL_FALSE, this patch looks good to me. Reviewed-by: Chad Versace ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs: Fix lowering of integer multiplication with cmod.
On Wednesday, May 27, 2015 10:58:01 AM Matt Turner wrote: > If the multiplication's result is unused, except by a conditional_mod, > the destination will be null. Since the final instruction in the lowered > sequence is a partial-write, we can't put the conditional mod on it and > we have to store the full result to a register and do a MOV with a > conditional mod. > > Cc: "10.6" > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90580 > --- > src/mesa/drivers/dri/i965/brw_fs.cpp | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp > b/src/mesa/drivers/dri/i965/brw_fs.cpp > index 42a0d78..5cc4fe6 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp > @@ -3606,6 +3606,10 @@ fs_visitor::lower_integer_multiplication() >* schedule multi-component multiplications much better. >*/ > > + if (inst->conditional_mod && inst->dst.is_null()) { > +inst->dst = fs_reg(GRF, alloc.allocate(dispatch_width / 8), > + inst->dst.type, dispatch_width); > + } > fs_reg low = inst->dst; > fs_reg high(GRF, alloc.allocate(dispatch_width / 8), > inst->dst.type, dispatch_width); > @@ -3655,6 +3659,13 @@ fs_visitor::lower_integer_multiplication() > low.stride = 2; > > insert(ADD(dst, low, high)); > + > + if (inst->conditional_mod) { > +fs_reg null(retype(brw_null_reg(), inst->dst.type)); > +fs_inst *mov = MOV(null, inst->dst); > +mov->conditional_mod = inst->conditional_mod; > +insert(mov); > + } >} > #undef insert > > Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 07/22] i965/gen9: Set horizontal alignment for the miptree
On Fri, Apr 17, 2015 at 04:51:28PM -0700, Anuj Phogat wrote: > Signed-off-by: Anuj Phogat > --- > src/mesa/drivers/dri/i965/brw_tex_layout.c | 80 > ++ > 1 file changed, 80 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c > b/src/mesa/drivers/dri/i965/brw_tex_layout.c > index 19ff5b8..d7cfe08 100644 > --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c > +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c > @@ -40,6 +40,80 @@ > #define FILE_DEBUG_FLAG DEBUG_MIPTREE > > static unsigned int > +tr_mode_horizontal_texture_alignment(struct brw_context *brw, > + struct intel_mipmap_tree *mt) > +{ > + const unsigned *align_yf, *align_ys; > + /* Bits per pixel/element. */ > + const unsigned bpp = _mesa_get_format_bytes(mt->format) * 8; > + > + /* Horizontal alignment tables for TRMODE_{YF,YS}. Value in below > +* tables specifies the horizontal alignment requirement in elements > +* for the surface. An element is defined as a pixel in uncompressed > +* surface formats, and as a compression block in compressed surface > +* formats. For MSFMT_DEPTH_STENCIL type multisampled surfaces, an > +* element is a sample. > +*/ > + const unsigned align_1d_yf[] = {4096, 2048, 1024, 512, 256}; > + const unsigned align_1d_ys[] = {65536, 32768, 16384, 8192, 4096}; > + const unsigned align_2d_yf[] = {64, 64, 32, 32, 16}; > + const unsigned align_2d_ys[] = {256, 256, 128, 128, 64}; > + const unsigned align_3d_yf[] = {16, 8, 8, 8, 4}; > + const unsigned align_3d_ys[] = {64, 32, 32, 32, 16}; > + int i = 0; > + > + /* Alignment computations below assume bpp >= 8 and a power of 2. */ > + assert (bpp >= 8 && bpp <= 128 && (bpp & (bpp - 1)) == 0); > + > + switch(mt->target) { > + case GL_TEXTURE_1D: > + case GL_TEXTURE_1D_ARRAY: > + align_yf = align_1d_yf; > + align_ys = align_1d_ys; > + break; > + case GL_TEXTURE_2D: > + case GL_TEXTURE_RECTANGLE: > + case GL_TEXTURE_2D_ARRAY: > + case GL_TEXTURE_CUBE_MAP: > + case GL_TEXTURE_CUBE_MAP_ARRAY: > + case GL_TEXTURE_2D_MULTISAMPLE: > + case GL_TEXTURE_2D_MULTISAMPLE_ARRAY: > + align_yf = align_2d_yf; > + align_ys = align_2d_ys; > + break; > + case GL_TEXTURE_3D: > + align_yf = align_3d_yf; > + align_ys = align_3d_ys; > + break; > + default: > + unreachable("not reached"); > + } > + > + /* Compute array index. */ > + while (bpp >> (i + 4)) > + i++; We've discussed this and the pow2 stuff above already. int ret_align = mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? align_yf[i] : align_ys[i]; > + > + if (mt->num_samples > 1) { Might be worth making a note here that this isn't for depthstencil surfaces (your code is right, just pointing this out) > + const unsigned ms_align_div[] = {2, 2, 4, 4}; > + int j = 0; > + /* num_samples must be power of 2. */ > + assert((mt->num_samples & (mt->num_samples -1)) == 0); > + > + /* Compute array index. */ > + while (mt->num_samples >> (j + 2)) > +j++; > + > + return mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? > +align_yf[i] / ms_align_div[j] : > +align_ys[i] / ms_align_div[j]; > + } > + > + return (mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? align_yf[i] > + : align_ys[i]); > +} > + > + I'd be in favor of a switch here, but it's up to you. If you follow what I have above, it's just this: switch (mt->num_samples) { case 2: case 4: divisor = 2; break; case 8: case 16: divisor = 4; break; default: divisor = 1; break; } return ret_align / divisor; > +static unsigned int > intel_horizontal_texture_alignment_unit(struct brw_context *brw, > struct intel_mipmap_tree *mt) > { > @@ -79,6 +153,12 @@ intel_horizontal_texture_alignment_unit(struct > brw_context *brw, > if (mt->format == MESA_FORMAT_S_UINT8) >return 8; > > + if (brw->gen >= 9 && mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE) { > + uint32_t align = tr_mode_horizontal_texture_alignment(brw, mt); > + /* XY_FAST_COPY_BLT doesn't support horizontal alignment < 32. */ > + return align < 32 ? 32 : align; > + } > + > if (brw->gen >= 7 && mt->format == MESA_FORMAT_Z_UNORM16) >return 8; > Seems a bit weird to me to split up HALIGN/VALIGN into separate patches, but it's up to you. Reviewed-by: Ben Widawsky ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 08/22] i965/gen9: Use HALIGN_16 if MCS is enabled
On Fri, Apr 17, 2015 at 04:51:29PM -0700, Anuj Phogat wrote: > Signed-off-by: Anuj Phogat > --- > src/mesa/drivers/dri/i965/brw_tex_layout.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c > b/src/mesa/drivers/dri/i965/brw_tex_layout.c > index d7cfe08..20dcc21 100644 > --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c > +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c > @@ -162,7 +162,8 @@ intel_horizontal_texture_alignment_unit(struct > brw_context *brw, > if (brw->gen >= 7 && mt->format == MESA_FORMAT_Z_UNORM16) >return 8; > > - if (brw->gen == 8 && mt->mcs_mt && mt->num_samples <= 1) > + if ((brw->gen == 8 && mt->mcs_mt && mt->num_samples <= 1) || > + (brw->gen >= 9 && mt->mcs_mt)) >return 16; > > return 4; This is incorrect as we've discussed. I have a branch to fix it. I'd suggest dropping it from your series. References: http://cgit.freedesktop.org/~bwidawsk/mesa/log/?h=halign-fix ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 11/15] egl: add eglGetSyncAttrib
On Wed 13 May 2015, Marek Olšák wrote: > From: Marek Olšák > > --- > src/egl/main/eglapi.c | 14 +- > src/egl/main/eglapi.h | 2 +- > src/egl/main/eglsync.c | 2 +- > src/egl/main/eglsync.h | 2 +- > 4 files changed, 16 insertions(+), 4 deletions(-) > I see two issues below. > diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c > index 544f7e4..6457798 100644 > --- a/src/egl/main/eglapi.c > +++ b/src/egl/main/eglapi.c > @@ -1558,6 +1559,17 @@ eglGetSyncAttribKHR(EGLDisplay dpy, EGLSync sync, > EGLint attribute, EGLint *valu > } > > > +EGLBoolean EGLAPIENTRY > +eglGetSyncAttribKHR(EGLDisplay dpy, EGLSync sync, EGLint attribute, EGLint > *value) > +{ > + EGLAttrib attrib = *value; > + EGLBoolean result = eglGetSyncAttrib(dpy, sync, attribute, &attrib); > + > + *value = attrib; > + return result; > +} This change violates the EGL_KHR_fence_sync spec, which says this regarding eglGetSyncAttribKHR: If any error occurs, <*value> is not modified. I think it's a good idea to add this quote to the function body so that future devs don't accidentally make the same mistake. > diff --git a/src/egl/main/eglapi.h b/src/egl/main/eglapi.h > index 44f589d..fdd3b05 100644 > --- a/src/egl/main/eglapi.h > +++ b/src/egl/main/eglapi.h > @@ -91,7 +91,7 @@ typedef EGLBoolean (*DestroySyncKHR_t)(_EGLDriver *drv, > _EGLDisplay *dpy, _EGLSy > typedef EGLint (*ClientWaitSyncKHR_t)(_EGLDriver *drv, _EGLDisplay *dpy, > _EGLSync *sync, EGLint flags, EGLTime timeout); > typedef EGLint (*WaitSyncKHR_t)(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSync > *sync); > typedef EGLBoolean (*SignalSyncKHR_t)(_EGLDriver *drv, _EGLDisplay *dpy, > _EGLSync *sync, EGLenum mode); > -typedef EGLBoolean (*GetSyncAttribKHR_t)(_EGLDriver *drv, _EGLDisplay *dpy, > _EGLSync *sync, EGLint attribute, EGLint *value); > +typedef EGLBoolean (*GetSyncAttribKHR_t)(_EGLDriver *drv, _EGLDisplay *dpy, > _EGLSync *sync, EGLint attribute, EGLAttrib *value); I think changing this change is potentially dangerous, because post-patch GetSyncAttribKHR_t has the signature of eglGetSyncAttrib and *not* eglGetSyncAttribKHR. The code will compile, but this discrepancy looks like a honeypot for accidental function misuse. And... > #ifdef EGL_NOK_swap_region > diff --git a/src/egl/main/eglsync.c b/src/egl/main/eglsync.c > index 205cdc0..a09d991 100644 > --- a/src/egl/main/eglsync.c > +++ b/src/egl/main/eglsync.c > @@ -142,7 +142,7 @@ _eglInitSync(_EGLSync *sync, _EGLDisplay *dpy, EGLenum > type, > > EGLBoolean > _eglGetSyncAttribKHR(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSync *sync, > - EGLint attribute, EGLint *value) > + EGLint attribute, EGLAttrib *value) I have the same issue here. Let's avoid introducing confusing discrepancies between function names and their signatures. Just name this one and the typedef to '_eglGetSyncAttrib' and 'GetSyncAttrib_t'. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 07/22] i965/gen9: Set horizontal alignment for the miptree
On Wed, May 27, 2015 at 11:56:24AM -0700, Ben Widawsky wrote: > On Fri, Apr 17, 2015 at 04:51:28PM -0700, Anuj Phogat wrote: > > Signed-off-by: Anuj Phogat > > --- > > src/mesa/drivers/dri/i965/brw_tex_layout.c | 80 > > ++ > > 1 file changed, 80 insertions(+) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c > > b/src/mesa/drivers/dri/i965/brw_tex_layout.c > > index 19ff5b8..d7cfe08 100644 > > --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c > > +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c > > @@ -40,6 +40,80 @@ > > #define FILE_DEBUG_FLAG DEBUG_MIPTREE > > > > static unsigned int > > +tr_mode_horizontal_texture_alignment(struct brw_context *brw, > > + struct intel_mipmap_tree *mt) > > +{ > > + const unsigned *align_yf, *align_ys; > > + /* Bits per pixel/element. */ > > + const unsigned bpp = _mesa_get_format_bytes(mt->format) * 8; > > + > > + /* Horizontal alignment tables for TRMODE_{YF,YS}. Value in below > > +* tables specifies the horizontal alignment requirement in elements > > +* for the surface. An element is defined as a pixel in uncompressed > > +* surface formats, and as a compression block in compressed surface > > +* formats. For MSFMT_DEPTH_STENCIL type multisampled surfaces, an > > +* element is a sample. > > +*/ > > + const unsigned align_1d_yf[] = {4096, 2048, 1024, 512, 256}; > > + const unsigned align_1d_ys[] = {65536, 32768, 16384, 8192, 4096}; > > > + const unsigned align_2d_yf[] = {64, 64, 32, 32, 16}; > > + const unsigned align_2d_ys[] = {256, 256, 128, 128, 64}; > > > + const unsigned align_3d_yf[] = {16, 8, 8, 8, 4}; > > + const unsigned align_3d_ys[] = {64, 32, 32, 32, 16}; > > + int i = 0; > > + > > + /* Alignment computations below assume bpp >= 8 and a power of 2. */ > > + assert (bpp >= 8 && bpp <= 128 && (bpp & (bpp - 1)) == 0); > > + > > + switch(mt->target) { > > + case GL_TEXTURE_1D: > > + case GL_TEXTURE_1D_ARRAY: > > + align_yf = align_1d_yf; > > + align_ys = align_1d_ys; > > + break; > > + case GL_TEXTURE_2D: > > + case GL_TEXTURE_RECTANGLE: > > + case GL_TEXTURE_2D_ARRAY: > > + case GL_TEXTURE_CUBE_MAP: > > + case GL_TEXTURE_CUBE_MAP_ARRAY: > > + case GL_TEXTURE_2D_MULTISAMPLE: > > + case GL_TEXTURE_2D_MULTISAMPLE_ARRAY: > > + align_yf = align_2d_yf; > > + align_ys = align_2d_ys; > > + break; > > + case GL_TEXTURE_3D: > > + align_yf = align_3d_yf; > > + align_ys = align_3d_ys; > > + break; > > + default: > > + unreachable("not reached"); > > + } > > + > > + /* Compute array index. */ > > + while (bpp >> (i + 4)) > > + i++; > > We've discussed this and the pow2 stuff above already. > > int ret_align = mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? > align_yf[i] : align_ys[i]; > > > + > > + if (mt->num_samples > 1) { > > Might be worth making a note here that this isn't for depthstencil surfaces > (your code is right, just pointing this out) > > > + const unsigned ms_align_div[] = {2, 2, 4, 4}; > > + int j = 0; > > + /* num_samples must be power of 2. */ > > + assert((mt->num_samples & (mt->num_samples -1)) == 0); > > + > > + /* Compute array index. */ > > + while (mt->num_samples >> (j + 2)) > > +j++; > > + > > + return mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? > > +align_yf[i] / ms_align_div[j] : > > +align_ys[i] / ms_align_div[j]; > > + } > > + > > + return (mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? align_yf[i] > > + : align_ys[i]); > > +} > > + > > + > > I'd be in favor of a switch here, but it's up to you. If you follow what I > have > above, it's just this: > > switch (mt->num_samples) { > case 2: > case 4: > divisor = 2; > break; > case 8: > case 16: > divisor = 4; > break; > default: > divisor = 1; > break; > } > > return ret_align / divisor; > > > +static unsigned int > > intel_horizontal_texture_alignment_unit(struct brw_context *brw, > > struct intel_mipmap_tree *mt) > > { > > @@ -79,6 +153,12 @@ intel_horizontal_texture_alignment_unit(struct > > brw_context *brw, > > if (mt->format == MESA_FORMAT_S_UINT8) > >return 8; > > > > + if (brw->gen >= 9 && mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE) { > > + uint32_t align = tr_mode_horizontal_texture_alignment(brw, mt); > > + /* XY_FAST_COPY_BLT doesn't support horizontal alignment < 32. */ > > + return align < 32 ? 32 : align; > > + } Just realized I didn't actually see this restriction. Could you point me to it offline? > > + > > if (brw->gen >= 7 && mt->format == MESA_FORMAT_Z_UNORM16) > >return 8; > > > > > Seems a bit weird to me to split up HALIGN/VALIGN into separate patches, but > i
Re: [Mesa-dev] [PATCH V2 09/22] i965/gen9: Set vertical alignment for the miptree
On Fri, Apr 17, 2015 at 04:51:30PM -0700, Anuj Phogat wrote: > Signed-off-by: Anuj Phogat > --- > src/mesa/drivers/dri/i965/brw_tex_layout.c | 74 > ++ > 1 file changed, 74 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c > b/src/mesa/drivers/dri/i965/brw_tex_layout.c > index 20dcc21..9342101 100644 > --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c > +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c > @@ -170,6 +170,69 @@ intel_horizontal_texture_alignment_unit(struct > brw_context *brw, > } > > static unsigned int > +tr_mode_vertical_texture_alignment(struct brw_context *brw, > + struct intel_mipmap_tree *mt) > +{ > + const unsigned *align_yf, *align_ys; > + /* Bits per pixel/element. */ > + const unsigned bpp = _mesa_get_format_bytes(mt->format) * 8; > + > + /* Vertical alignment tables for TRMODE_YF and TRMODE_YS. */ > + const unsigned align_2d_yf[] = {64, 32, 32, 16, 16}; > + const unsigned align_2d_ys[] = {256, 128, 128, 64, 64}; > + const unsigned align_3d_yf[] = {16, 16, 16, 8, 8}; > + const unsigned align_3d_ys[] = {32, 32, 32, 16, 16}; > + int i = 0; > + > + assert(brw->gen >= 9 && > + mt->target != GL_TEXTURE_1D && > + mt->target != GL_TEXTURE_1D_ARRAY); > + > + /* Alignment computations below assume bpp >= 8 and a power of 2. */ > + assert (bpp >= 8 && bpp <= 128 && (bpp & (bpp - 1)) == 0); > + > + switch(mt->target) { > + case GL_TEXTURE_2D: > + case GL_TEXTURE_RECTANGLE: > + case GL_TEXTURE_2D_ARRAY: > + case GL_TEXTURE_CUBE_MAP: > + case GL_TEXTURE_CUBE_MAP_ARRAY: > + case GL_TEXTURE_2D_MULTISAMPLE: > + case GL_TEXTURE_2D_MULTISAMPLE_ARRAY: > + align_yf = align_2d_yf; > + align_ys = align_2d_ys; > + break; > + case GL_TEXTURE_3D: > + align_yf = align_3d_yf; > + align_ys = align_3d_ys; > + break; > + default: > + unreachable("not reached"); > + } > + > + /* Compute array index. */ > + while (bpp >> (i + 4)) > + i++; > + > + if (mt->num_samples > 1) { > + const unsigned ms_align_div[] = {1, 2, 2, 4}; > + int j = 0; > + /* num_samples must be power of 2. */ > + assert((mt->num_samples & (mt->num_samples -1)) == 0); > + > + /* Compute array index. */ > + while (mt->num_samples >> (j + 2)) > + j++; > + return mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? > + align_yf[i] / ms_align_div[j] : > + align_ys[i] / ms_align_div[j]; > + } > + > + return (mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? align_yf[i] > + : align_ys[i]); > +} > + > +static unsigned int I think all the comments I had for horizontal alignments apply here too. > intel_vertical_texture_alignment_unit(struct brw_context *brw, >struct intel_mipmap_tree *mt) > { > @@ -202,6 +265,17 @@ intel_vertical_texture_alignment_unit(struct brw_context > *brw, > if (mt->format == MESA_FORMAT_S_UINT8) >return brw->gen >= 7 ? 8 : 4; > > + /* Vertical texture alignment is not relevant for 1D texture targets. */ > + if (brw->gen >= 9 && (mt->target == GL_TEXTURE_1D || > + mt->target == GL_TEXTURE_1D_ARRAY)) > + return 1; I think this hunk can go away with Neil's patches since then. > + > + if (mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE) { > + uint32_t align = tr_mode_vertical_texture_alignment(brw, mt); > + /* XY_FAST_COPY_BLT doesn't support vertical alignment < 64 */ > + return align < 64 ? 64 : align; > + } > + Like the HALIGN, I don't see this restriction, can you tell me where it is. > /* Broadwell only supports VALIGN of 4, 8, and 16. The BSpec says 4 > * should always be used, except for stencil buffers, which should be 8. > */ Reviewed-by: Ben Widawsky ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs: Properly handle explicit depth in SIMD16 with dual-source blend
Reviewed-by: Matt Turner ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/15] egl: add eglCreateImage
On Fri 15 May 2015, Emil Velikov wrote: > On 12/05/15 22:54, Marek Olšák wrote: > > From: Marek Olšák > > > > --- > > src/egl/main/eglapi.c | 38 ++ > > 1 file changed, 38 insertions(+) > > > > diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c > > index 6457798..34a113b 100644 > > --- a/src/egl/main/eglapi.c > > +++ b/src/egl/main/eglapi.c > > @@ -251,6 +251,30 @@ _eglUnlockDisplay(_EGLDisplay *dpy) > > } > > > > > > +static EGLint * > > +_eglConvertAttribsToInt(const EGLAttrib *attr_list) > > +{ > > + EGLint *int_attribs = NULL; > > + > > + /* Convert attributes from EGLAttrib[] to EGLint[] */ > > + if (attr_list) { > > + int i, size = 0; > > + > > + while (attr_list[size] != EGL_NONE) > > + size += 2; > > + > > + if (size) { > > + size += 1; /* add space for EGL_NONE */ > > + int_attribs = malloc(size * sizeof(int_attribs[0])); > > + > > + for (i = 0; i < size; i++) > > +int_attribs[i] = attr_list[i]; > In the unlikely event that malloc fails, it'll be nice to not crash. NAK. There is a stack overflow vulnerability here, even when malloc succeeds. An attacker can pass a very large but valid `EGLint *attrib_list` into an EGL entry point, forcing the size calculation given to malloc to overflow to a small positive integer. Then _eglConvertAttribsToInt will blithely copy a portion (perhaps most) of the attacker's attrib list onto the stack! To prevent the stack overflow, _eglConvertAttribsToInt should use calloc() and abort if allocation fails. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Silence warning in 3-src type-setting.
--- src/mesa/drivers/dri/i965/brw_eu_emit.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index e78d0be..a1d11f3 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -914,6 +914,8 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct brw_reg dest, brw_inst_set_3src_src_type(devinfo, inst, BRW_3SRC_TYPE_UD); brw_inst_set_3src_dst_type(devinfo, inst, BRW_3SRC_TYPE_UD); break; + default: + unreachable("not reached"); } } -- 2.3.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: fix binding validation for interface blocks
Matt Turner wrote > On Wed, May 27, 2015 at 3:53 AM, Timothy Arceri wrote: > > Cc: Kenneth Graunke > > --- > > > > It looks like this may never have actually worked, at least there > > is no way it could have for inteface block arrays :) > > > > No piglit regressions. > > > > src/glsl/ast_to_hir.cpp | 22 ++ > > 1 file changed, 14 insertions(+), 8 deletions(-) > > > > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp > > index 0ab664e..861bbfe 100644 > > --- a/src/glsl/ast_to_hir.cpp > > +++ b/src/glsl/ast_to_hir.cpp > > @@ -2041,10 +2041,10 @@ validate_matrix_layout_for_type(struct > > _mesa_glsl_parse_state *state, > > static bool > > validate_binding_qualifier(struct _mesa_glsl_parse_state *state, > > YYLTYPE *loc, > > - ir_variable *var, > > + const glsl_type *type, > > const ast_type_qualifier *qual) > > { > > - if (var->data.mode != ir_var_uniform) { > > + if (!qual->flags.q.uniform) { > >_mesa_glsl_error(loc, state, > > "the \"binding\" qualifier only applies to > > uniforms"); > >return false; > > @@ -2056,10 +2056,11 @@ validate_binding_qualifier(struct > > _mesa_glsl_parse_state *state, > > } > > > > const struct gl_context *const ctx = state->ctx; > > - unsigned elements = var->type->is_array() ? var->type->length : 1; > > + unsigned elements = type->is_array() ? type->length : 1; > > unsigned max_index = qual->binding + elements - 1; > > + const glsl_type *base_type = type->without_array(); > > > > - if (var->type->is_interface()) { > > + if (base_type->is_interface()) { > >/* UBOs. From page 60 of the GLSL 4.20 specification: > > * "If the binding point for any uniform block instance is less than > > zero, > > * or greater than or equal to the implementation-dependent maximum > > @@ -2077,8 +2078,7 @@ validate_binding_qualifier(struct > > _mesa_glsl_parse_state *state, > >ctx->Const.MaxUniformBufferBindings); > > return false; > >} > > - } else if (var->type->is_sampler() || > > - (var->type->is_array() && > > var->type->fields.array->is_sampler())) { > > + } else if (base_type->is_sampler()) { > >/* Samplers. From page 63 of the GLSL 4.20 specification: > > * "If the binding is less than zero, or greater than or equal to the > > * implementation-dependent maximum supported number of units, a > > @@ -2095,7 +2095,7 @@ validate_binding_qualifier(struct > > _mesa_glsl_parse_state *state, > > > > return false; > >} > > - } else if (var->type->contains_atomic()) { > > + } else if (base_type->contains_atomic()) { > >assert(ctx->Const.MaxAtomicBufferBindings <= > > MAX_COMBINED_ATOMIC_BUFFERS); > >if (unsigned(qual->binding) >= ctx->Const.MaxAtomicBufferBindings) { > > _mesa_glsl_error(loc, state, "layout(binding = %d) exceeds the " > > @@ -2651,7 +2651,7 @@ apply_type_qualifier_to_variable(const struct > > ast_type_qualifier *qual, > > } > > > > if (qual->flags.q.explicit_binding && > > - validate_binding_qualifier(state, loc, var, qual)) { > > + validate_binding_qualifier(state, loc, var->type, qual)) { > >var->data.explicit_binding = true; > >var->data.binding = qual->binding; > > } > > @@ -5658,6 +5658,8 @@ ast_interface_block::hir(exec_list *instructions, > > num_variables, > > packing, > > this->block_name); > > + if (this->layout.flags.q.explicit_binding) > > + validate_binding_qualifier(state, &loc, block_type, &this->layout); > > > > if (!state->symbols->add_interface(block_type->name, block_type, > > var_mode)) { > >YYLTYPE loc = this->get_location(); > > @@ -5744,6 +5746,10 @@ ast_interface_block::hir(exec_list *instructions, > > const glsl_type *block_array_type = > > process_array_type(&loc, block_type, this->array_specifier, > > state); > > > > + if(this->layout.flags.q.explicit_binding) > > Space after if The space was intentional to separate validation code from from code that creates the IR. If you still think it should go I can remove it but I think it helps with code readability.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: fix binding validation for interface blocks
On 28 May 2015 at 06:51, Timothy Arceri wrote: > Matt Turner wrote > >> On Wed, May 27, 2015 at 3:53 AM, Timothy Arceri >> wrote: >> > Cc: Kenneth Graunke >> > --- >> > >> > It looks like this may never have actually worked, at least there >> > is no way it could have for inteface block arrays :) >> > >> > No piglit regressions. >> > >> > src/glsl/ast_to_hir.cpp | 22 ++ >> > 1 file changed, 14 insertions(+), 8 deletions(-) >> > >> > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp >> > index 0ab664e..861bbfe 100644 >> > --- a/src/glsl/ast_to_hir.cpp >> > +++ b/src/glsl/ast_to_hir.cpp >> > @@ -2041,10 +2041,10 @@ validate_matrix_layout_for_type(struct >> > _mesa_glsl_parse_state *state, >> > static bool >> > validate_binding_qualifier(struct _mesa_glsl_parse_state *state, >> > YYLTYPE *loc, >> > - ir_variable *var, >> > + const glsl_type *type, >> > const ast_type_qualifier *qual) >> > { >> > - if (var->data.mode != ir_var_uniform) { >> > + if (!qual->flags.q.uniform) { >> >_mesa_glsl_error(loc, state, >> > "the \"binding\" qualifier only applies to >> > uniforms"); >> >return false; >> > @@ -2056,10 +2056,11 @@ validate_binding_qualifier(struct >> > _mesa_glsl_parse_state *state, >> > } >> > >> > const struct gl_context *const ctx = state->ctx; >> > - unsigned elements = var->type->is_array() ? var->type->length : 1; >> > + unsigned elements = type->is_array() ? type->length : 1; >> > unsigned max_index = qual->binding + elements - 1; >> > + const glsl_type *base_type = type->without_array(); >> > >> > - if (var->type->is_interface()) { >> > + if (base_type->is_interface()) { >> >/* UBOs. From page 60 of the GLSL 4.20 specification: >> > * "If the binding point for any uniform block instance is less >> > than zero, >> > * or greater than or equal to the implementation-dependent >> > maximum >> > @@ -2077,8 +2078,7 @@ validate_binding_qualifier(struct >> > _mesa_glsl_parse_state *state, >> >ctx->Const.MaxUniformBufferBindings); >> > return false; >> >} >> > - } else if (var->type->is_sampler() || >> > - (var->type->is_array() && >> > var->type->fields.array->is_sampler())) { >> > + } else if (base_type->is_sampler()) { >> >/* Samplers. From page 63 of the GLSL 4.20 specification: >> > * "If the binding is less than zero, or greater than or equal to >> > the >> > * implementation-dependent maximum supported number of units, a >> > @@ -2095,7 +2095,7 @@ validate_binding_qualifier(struct >> > _mesa_glsl_parse_state *state, >> > >> > return false; >> >} >> > - } else if (var->type->contains_atomic()) { >> > + } else if (base_type->contains_atomic()) { >> >assert(ctx->Const.MaxAtomicBufferBindings <= >> > MAX_COMBINED_ATOMIC_BUFFERS); >> >if (unsigned(qual->binding) >= >> > ctx->Const.MaxAtomicBufferBindings) { >> > _mesa_glsl_error(loc, state, "layout(binding = %d) exceeds the >> > " >> > @@ -2651,7 +2651,7 @@ apply_type_qualifier_to_variable(const struct >> > ast_type_qualifier *qual, >> > } >> > >> > if (qual->flags.q.explicit_binding && >> > - validate_binding_qualifier(state, loc, var, qual)) { >> > + validate_binding_qualifier(state, loc, var->type, qual)) { >> >var->data.explicit_binding = true; >> >var->data.binding = qual->binding; >> > } >> > @@ -5658,6 +5658,8 @@ ast_interface_block::hir(exec_list *instructions, >> > num_variables, >> > packing, >> > this->block_name); >> > + if (this->layout.flags.q.explicit_binding) >> > + validate_binding_qualifier(state, &loc, block_type, >> > &this->layout); >> > >> > if (!state->symbols->add_interface(block_type->name, block_type, >> > var_mode)) { >> >YYLTYPE loc = this->get_location(); >> > @@ -5744,6 +5746,10 @@ ast_interface_block::hir(exec_list *instructions, >> > const glsl_type *block_array_type = >> > process_array_type(&loc, block_type, this->array_specifier, >> > state); >> > >> > + if(this->layout.flags.q.explicit_binding) >> >> Space after if > > The space was intentional to separate validation code from from code that > creates the IR. If you still think it should go I can remove it but I think > it helps with code readability. he meant the missing space between if and ( Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: fix binding validation for interface blocks
On Thu, 2015-05-28 at 07:07 +1000, Dave Airlie wrote: > On 28 May 2015 at 06:51, Timothy Arceri wrote: > > Matt Turner wrote > > > >> On Wed, May 27, 2015 at 3:53 AM, Timothy Arceri > >> wrote: > >> > Cc: Kenneth Graunke > >> > --- > >> > > >> > It looks like this may never have actually worked, at least there > >> > is no way it could have for inteface block arrays :) > >> > > >> > No piglit regressions. > >> > > >> > src/glsl/ast_to_hir.cpp | 22 ++ > >> > 1 file changed, 14 insertions(+), 8 deletions(-) > >> > > >> > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp > >> > index 0ab664e..861bbfe 100644 > >> > --- a/src/glsl/ast_to_hir.cpp > >> > +++ b/src/glsl/ast_to_hir.cpp > >> > @@ -2041,10 +2041,10 @@ validate_matrix_layout_for_type(struct > >> > _mesa_glsl_parse_state *state, > >> > static bool > >> > validate_binding_qualifier(struct _mesa_glsl_parse_state *state, > >> > YYLTYPE *loc, > >> > - ir_variable *var, > >> > + const glsl_type *type, > >> > const ast_type_qualifier *qual) > >> > { > >> > - if (var->data.mode != ir_var_uniform) { > >> > + if (!qual->flags.q.uniform) { > >> >_mesa_glsl_error(loc, state, > >> > "the \"binding\" qualifier only applies to > >> > uniforms"); > >> >return false; > >> > @@ -2056,10 +2056,11 @@ validate_binding_qualifier(struct > >> > _mesa_glsl_parse_state *state, > >> > } > >> > > >> > const struct gl_context *const ctx = state->ctx; > >> > - unsigned elements = var->type->is_array() ? var->type->length : 1; > >> > + unsigned elements = type->is_array() ? type->length : 1; > >> > unsigned max_index = qual->binding + elements - 1; > >> > + const glsl_type *base_type = type->without_array(); > >> > > >> > - if (var->type->is_interface()) { > >> > + if (base_type->is_interface()) { > >> >/* UBOs. From page 60 of the GLSL 4.20 specification: > >> > * "If the binding point for any uniform block instance is less > >> > than zero, > >> > * or greater than or equal to the implementation-dependent > >> > maximum > >> > @@ -2077,8 +2078,7 @@ validate_binding_qualifier(struct > >> > _mesa_glsl_parse_state *state, > >> >ctx->Const.MaxUniformBufferBindings); > >> > return false; > >> >} > >> > - } else if (var->type->is_sampler() || > >> > - (var->type->is_array() && > >> > var->type->fields.array->is_sampler())) { > >> > + } else if (base_type->is_sampler()) { > >> >/* Samplers. From page 63 of the GLSL 4.20 specification: > >> > * "If the binding is less than zero, or greater than or equal to > >> > the > >> > * implementation-dependent maximum supported number of units, a > >> > @@ -2095,7 +2095,7 @@ validate_binding_qualifier(struct > >> > _mesa_glsl_parse_state *state, > >> > > >> > return false; > >> >} > >> > - } else if (var->type->contains_atomic()) { > >> > + } else if (base_type->contains_atomic()) { > >> >assert(ctx->Const.MaxAtomicBufferBindings <= > >> > MAX_COMBINED_ATOMIC_BUFFERS); > >> >if (unsigned(qual->binding) >= > >> > ctx->Const.MaxAtomicBufferBindings) { > >> > _mesa_glsl_error(loc, state, "layout(binding = %d) exceeds the > >> > " > >> > @@ -2651,7 +2651,7 @@ apply_type_qualifier_to_variable(const struct > >> > ast_type_qualifier *qual, > >> > } > >> > > >> > if (qual->flags.q.explicit_binding && > >> > - validate_binding_qualifier(state, loc, var, qual)) { > >> > + validate_binding_qualifier(state, loc, var->type, qual)) { > >> >var->data.explicit_binding = true; > >> >var->data.binding = qual->binding; > >> > } > >> > @@ -5658,6 +5658,8 @@ ast_interface_block::hir(exec_list *instructions, > >> > num_variables, > >> > packing, > >> > this->block_name); > >> > + if (this->layout.flags.q.explicit_binding) > >> > + validate_binding_qualifier(state, &loc, block_type, > >> > &this->layout); > >> > > >> > if (!state->symbols->add_interface(block_type->name, block_type, > >> > var_mode)) { > >> >YYLTYPE loc = this->get_location(); > >> > @@ -5744,6 +5746,10 @@ ast_interface_block::hir(exec_list *instructions, > >> > const glsl_type *block_array_type = > >> > process_array_type(&loc, block_type, this->array_specifier, > >> > state); > >> > > >> > + if(this->layout.flags.q.explicit_binding) > >> > >> Space after if > > > > The space was intentional to separate validation code from from code that > > creates the IR. If you still think it should go I can remove it but I think > > it helps with code readability. > > he meant the missing space between if and ( > oh ri
Re: [Mesa-dev] [PATCH V2 17/22] i965/gen9: Add XY_FAST_COPY_BLT support to intelEmitCopyBlit()
On Wed, May 27, 2015 at 10:42 AM, Ben Widawsky wrote: > On Thu, May 21, 2015 at 05:46:48PM -0700, Anuj Phogat wrote: >> On Thu, May 21, 2015 at 12:26 PM, Ben Widawsky wrote: > > SNIP > >> > As I mentioned above wrt the CL alignment, I think it's safe to do for both >> > modes. Something like: >> > >> > /* Check both are at least cacheline aligned */ >> > if (brw->gen >= 8 && (src_offset | dst_offset) & 63) >> > return false; >> > >> We do this check for SRC_COPY_BLT in alignemnet_valid(). I added >> the restrictions for FAST_COPY_BLT in fast_copy_blit_error_check(). >> Doing this check for both satisy all alignment restrictions of both modes >> except one in fast copy blit: when src is untiled and dst is tiled, dst >> should be 64 byte aligned and src should be 16 bytes aligned. See >> fast_copy_blit_error_check(). >> But we're adding a restriction for it to be 64 byte by the suggested >> change. Is there any advantage of the suggested check for both? >> > > Oh hmm, I hadn't noticed a 16byte alignment requirement anywhere. I thought > 64b > was the minimum - if it's not ignore me. > I noticed 16 byte alignment requirement has been removed in an update. I'll use your suggestion. >> >> if ((dst_y_tiled || src_y_tiled) && brw->gen < 6) >> >>return false; >> >> >> >> assert(!dst_y_tiled || (dst_pitch % 128) == 0); >> >> assert(!src_y_tiled || (src_pitch % 128) == 0); >> >> >> >> + if (use_fast_copy_blit == false && >> >> + (src_tr_mode != INTEL_MIPTREE_TRMODE_NONE || >> >> +dst_tr_mode != INTEL_MIPTREE_TRMODE_NONE)) >> >> + return false; >> >> + >> > >> > It seems like we should make it an error to use this improperly: >> > >> > /* It's an error for a caller to try to do a regular blit with a yf/ys >> > region */ >> > assert(use_fast_copy_blit && >> >(src_tr_mode != INTEL_MIPTREE_TRMODE_NONE) && >> >(dst_tr_mode != INTEL_MIPTREE_TRMODE_NONE)) >> > >> Agree. >> >> /* do space check before going any further */ >> >> do { >> >> aper_array[0] = brw->batch.bo; >> >> @@ -334,13 +498,6 @@ intelEmitCopyBlit(struct brw_context *brw, >> >> src_buffer, src_pitch, src_offset, src_x, src_y, >> >> dst_buffer, dst_pitch, dst_offset, dst_x, dst_y, w, h); >> >> >> >> - /* Blit pitch must be dword-aligned. Otherwise, the hardware appears >> >> to drop >> >> -* the low bits. Offsets must be naturally aligned. >> >> -*/ >> >> - if (src_pitch % 4 != 0 || src_offset % cpp != 0 || >> >> - dst_pitch % 4 != 0 || dst_offset % cpp != 0) >> >> - return false; >> >> - >> >> /* For big formats (such as floating point), do the copy using 16 or >> >> 32bpp >> >> * and multiply the coordinates. >> >> */ >> >> @@ -359,27 +516,76 @@ intelEmitCopyBlit(struct brw_context *brw, >> >>} >> >> } >> >> >> >> - BR13 = br13_for_cpp(cpp) | translate_raster_op(logic_op) << 16; >> >> + if (use_fast_copy_blit && >> >> + fast_copy_blit_error_check(src_offset, src_pitch, src_tiling, >> >> + dst_offset, dst_pitch, dst_tiling, >> >> cpp)) { >> >> + BR13 = br13_for_cpp(cpp); >> >> >> > >> > If you follow about you can remove the error check here as we know it's >> > safe to >> > proceed. >> > >> Right. As i understand you mean to move the fast_copy_blit_error_check() >> inside >> can_fast_copy_blit() helper ? >> >> >> - switch (cpp) { >> >> - case 1: >> >> - case 2: >> >> - CMD = XY_SRC_COPY_BLT_CMD; >> >> - break; >> >> - case 4: >> >> - CMD = XY_SRC_COPY_BLT_CMD | XY_BLT_WRITE_ALPHA | XY_BLT_WRITE_RGB; >> >> - break; >> >> - default: >> >> - return false; >> >> - } >> >> + if (src_tr_mode == INTEL_MIPTREE_TRMODE_YF) >> >> + BR13 |= XY_SRC_TRMODE_YF; >> >> >> >> - if (dst_tiling != I915_TILING_NONE) { >> >> - CMD |= XY_DST_TILED; >> >> - dst_pitch /= 4; >> >> - } >> >> - if (src_tiling != I915_TILING_NONE) { >> >> - CMD |= XY_SRC_TILED; >> >> - src_pitch /= 4; >> >> + if (dst_tr_mode == INTEL_MIPTREE_TRMODE_YF) >> >> + BR13 |= XY_DST_TRMODE_YF; >> >> + >> >> + CMD = XY_FAST_COPY_BLT_CMD; >> >> + >> >> + if (dst_tiling != I915_TILING_NONE) { >> >> + SET_TILING_XY_FAST_COPY_BLT(dst_tiling, dst_tr_mode, XY_DST); >> >> + /* Pitch value should be specified as a number of Dwords. */ >> >> + dst_pitch /= 4; >> >> + } >> >> + if (src_tiling != I915_TILING_NONE) { >> >> + SET_TILING_XY_FAST_COPY_BLT(src_tiling, src_tr_mode, XY_SRC); >> >> + /* Pitch value should be specified as a number of Dwords. */ >> >> + src_pitch /= 4; >> >> + } >> >> + >> >> + CMD |= get_tr_horizontal_align(src_tr_mode, cpp, true /* is_src >> >> */); >> >> + CMD |= get_tr_vertical_align(src_tr_mode, cpp, true /* is_src */); >> >> + >> >> + CMD |= get_tr_horizontal_align(dst_tr_mode, cpp, false /* is_src >> >> */); >> >> +
Re: [Mesa-dev] [v4 PATCH 05/10] mesa: helper function for scissor box of gl_framebuffer
> Out of curiosity, what editor are you using? Usually setting the > indentation rules in your editor takes care of most of these problems. I am using emacs23 (under Ubuntu) and now for something amusing... > We have a .dir-locals.el file that should provide the correct settings > for Emacs. emacs23 rejected that .el file, switching to emacs24 solved the fiasco. Shudders. IF I had had a new emacs all the tabbing nits would have never appeared. Oh well. -Kevin ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 07/22] i965/gen9: Set horizontal alignment for the miptree
On Wed, May 27, 2015 at 11:56 AM, Ben Widawsky wrote: > On Fri, Apr 17, 2015 at 04:51:28PM -0700, Anuj Phogat wrote: >> Signed-off-by: Anuj Phogat >> --- >> src/mesa/drivers/dri/i965/brw_tex_layout.c | 80 >> ++ >> 1 file changed, 80 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c >> b/src/mesa/drivers/dri/i965/brw_tex_layout.c >> index 19ff5b8..d7cfe08 100644 >> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c >> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c >> @@ -40,6 +40,80 @@ >> #define FILE_DEBUG_FLAG DEBUG_MIPTREE >> >> static unsigned int >> +tr_mode_horizontal_texture_alignment(struct brw_context *brw, >> + struct intel_mipmap_tree *mt) >> +{ >> + const unsigned *align_yf, *align_ys; >> + /* Bits per pixel/element. */ >> + const unsigned bpp = _mesa_get_format_bytes(mt->format) * 8; >> + >> + /* Horizontal alignment tables for TRMODE_{YF,YS}. Value in below >> +* tables specifies the horizontal alignment requirement in elements >> +* for the surface. An element is defined as a pixel in uncompressed >> +* surface formats, and as a compression block in compressed surface >> +* formats. For MSFMT_DEPTH_STENCIL type multisampled surfaces, an >> +* element is a sample. >> +*/ >> + const unsigned align_1d_yf[] = {4096, 2048, 1024, 512, 256}; >> + const unsigned align_1d_ys[] = {65536, 32768, 16384, 8192, 4096}; > >> + const unsigned align_2d_yf[] = {64, 64, 32, 32, 16}; >> + const unsigned align_2d_ys[] = {256, 256, 128, 128, 64}; > >> + const unsigned align_3d_yf[] = {16, 8, 8, 8, 4}; >> + const unsigned align_3d_ys[] = {64, 32, 32, 32, 16}; >> + int i = 0; >> + >> + /* Alignment computations below assume bpp >= 8 and a power of 2. */ >> + assert (bpp >= 8 && bpp <= 128 && (bpp & (bpp - 1)) == 0); >> + >> + switch(mt->target) { >> + case GL_TEXTURE_1D: >> + case GL_TEXTURE_1D_ARRAY: >> + align_yf = align_1d_yf; >> + align_ys = align_1d_ys; >> + break; >> + case GL_TEXTURE_2D: >> + case GL_TEXTURE_RECTANGLE: >> + case GL_TEXTURE_2D_ARRAY: >> + case GL_TEXTURE_CUBE_MAP: >> + case GL_TEXTURE_CUBE_MAP_ARRAY: >> + case GL_TEXTURE_2D_MULTISAMPLE: >> + case GL_TEXTURE_2D_MULTISAMPLE_ARRAY: >> + align_yf = align_2d_yf; >> + align_ys = align_2d_ys; >> + break; >> + case GL_TEXTURE_3D: >> + align_yf = align_3d_yf; >> + align_ys = align_3d_ys; >> + break; >> + default: >> + unreachable("not reached"); >> + } >> + >> + /* Compute array index. */ >> + while (bpp >> (i + 4)) >> + i++; > > We've discussed this and the pow2 stuff above already. > > int ret_align = mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? > align_yf[i] : align_ys[i]; > >> + >> + if (mt->num_samples > 1) { > > Might be worth making a note here that this isn't for depthstencil surfaces > (your code is right, just pointing this out) > >> + const unsigned ms_align_div[] = {2, 2, 4, 4}; >> + int j = 0; >> + /* num_samples must be power of 2. */ >> + assert((mt->num_samples & (mt->num_samples -1)) == 0); >> + >> + /* Compute array index. */ >> + while (mt->num_samples >> (j + 2)) >> +j++; >> + >> + return mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? >> +align_yf[i] / ms_align_div[j] : >> +align_ys[i] / ms_align_div[j]; >> + } >> + >> + return (mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? align_yf[i] >> + : align_ys[i]); >> +} >> + >> + > > I'd be in favor of a switch here, but it's up to you. If you follow what I > have > above, it's just this: > Switch does look simpler. I'll use it. > switch (mt->num_samples) { > case 2: > case 4: > divisor = 2; > break; > case 8: > case 16: > divisor = 4; > break; > default: > divisor = 1; > break; > } > > return ret_align / divisor; > >> +static unsigned int >> intel_horizontal_texture_alignment_unit(struct brw_context *brw, >> struct intel_mipmap_tree *mt) >> { >> @@ -79,6 +153,12 @@ intel_horizontal_texture_alignment_unit(struct >> brw_context *brw, >> if (mt->format == MESA_FORMAT_S_UINT8) >>return 8; >> >> + if (brw->gen >= 9 && mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE) { >> + uint32_t align = tr_mode_horizontal_texture_alignment(brw, mt); >> + /* XY_FAST_COPY_BLT doesn't support horizontal alignment < 32. */ >> + return align < 32 ? 32 : align; >> + } >> + >> if (brw->gen >= 7 && mt->format == MESA_FORMAT_Z_UNORM16) >>return 8; >> > > > Seems a bit weird to me to split up HALIGN/VALIGN into separate patches, but > it's up to you. > > Reviewed-by: Ben Widawsky ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedeskt
Re: [Mesa-dev] [PATCH V2 09/22] i965/gen9: Set vertical alignment for the miptree
On Wed, May 27, 2015 at 12:04 PM, Ben Widawsky wrote: > On Fri, Apr 17, 2015 at 04:51:30PM -0700, Anuj Phogat wrote: >> Signed-off-by: Anuj Phogat >> --- >> src/mesa/drivers/dri/i965/brw_tex_layout.c | 74 >> ++ >> 1 file changed, 74 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c >> b/src/mesa/drivers/dri/i965/brw_tex_layout.c >> index 20dcc21..9342101 100644 >> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c >> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c >> @@ -170,6 +170,69 @@ intel_horizontal_texture_alignment_unit(struct >> brw_context *brw, >> } >> >> static unsigned int >> +tr_mode_vertical_texture_alignment(struct brw_context *brw, >> + struct intel_mipmap_tree *mt) >> +{ >> + const unsigned *align_yf, *align_ys; >> + /* Bits per pixel/element. */ >> + const unsigned bpp = _mesa_get_format_bytes(mt->format) * 8; >> + >> + /* Vertical alignment tables for TRMODE_YF and TRMODE_YS. */ >> + const unsigned align_2d_yf[] = {64, 32, 32, 16, 16}; >> + const unsigned align_2d_ys[] = {256, 128, 128, 64, 64}; >> + const unsigned align_3d_yf[] = {16, 16, 16, 8, 8}; >> + const unsigned align_3d_ys[] = {32, 32, 32, 16, 16}; >> + int i = 0; >> + >> + assert(brw->gen >= 9 && >> + mt->target != GL_TEXTURE_1D && >> + mt->target != GL_TEXTURE_1D_ARRAY); >> + >> + /* Alignment computations below assume bpp >= 8 and a power of 2. */ >> + assert (bpp >= 8 && bpp <= 128 && (bpp & (bpp - 1)) == 0); >> + >> + switch(mt->target) { >> + case GL_TEXTURE_2D: >> + case GL_TEXTURE_RECTANGLE: >> + case GL_TEXTURE_2D_ARRAY: >> + case GL_TEXTURE_CUBE_MAP: >> + case GL_TEXTURE_CUBE_MAP_ARRAY: >> + case GL_TEXTURE_2D_MULTISAMPLE: >> + case GL_TEXTURE_2D_MULTISAMPLE_ARRAY: >> + align_yf = align_2d_yf; >> + align_ys = align_2d_ys; >> + break; >> + case GL_TEXTURE_3D: >> + align_yf = align_3d_yf; >> + align_ys = align_3d_ys; >> + break; >> + default: >> + unreachable("not reached"); >> + } >> + >> + /* Compute array index. */ >> + while (bpp >> (i + 4)) >> + i++; >> + >> + if (mt->num_samples > 1) { >> + const unsigned ms_align_div[] = {1, 2, 2, 4}; >> + int j = 0; >> + /* num_samples must be power of 2. */ >> + assert((mt->num_samples & (mt->num_samples -1)) == 0); >> + >> + /* Compute array index. */ >> + while (mt->num_samples >> (j + 2)) >> + j++; >> + return mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? >> + align_yf[i] / ms_align_div[j] : >> + align_ys[i] / ms_align_div[j]; >> + } >> + >> + return (mt->tr_mode == INTEL_MIPTREE_TRMODE_YF ? align_yf[i] >> + : align_ys[i]); >> +} >> + >> +static unsigned int > > I think all the comments I had for horizontal alignments apply here too. > >> intel_vertical_texture_alignment_unit(struct brw_context *brw, >>struct intel_mipmap_tree *mt) >> { >> @@ -202,6 +265,17 @@ intel_vertical_texture_alignment_unit(struct >> brw_context *brw, >> if (mt->format == MESA_FORMAT_S_UINT8) >>return brw->gen >= 7 ? 8 : 4; >> >> + /* Vertical texture alignment is not relevant for 1D texture targets. */ >> + if (brw->gen >= 9 && (mt->target == GL_TEXTURE_1D || >> + mt->target == GL_TEXTURE_1D_ARRAY)) >> + return 1; > > I think this hunk can go away with Neil's patches since then. > right. I'll remove it. >> + >> + if (mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE) { >> + uint32_t align = tr_mode_vertical_texture_alignment(brw, mt); >> + /* XY_FAST_COPY_BLT doesn't support vertical alignment < 64 */ >> + return align < 64 ? 64 : align; >> + } >> + > > Like the HALIGN, I don't see this restriction, can you tell me where it is. > >> /* Broadwell only supports VALIGN of 4, 8, and 16. The BSpec says 4 >> * should always be used, except for stencil buffers, which should be 8. >> */ > > Reviewed-by: Ben Widawsky ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/15] egl: import headers from Khronos EGL registry
Hi Chad, I broke up the patches and sent them as a separate patch series a few days ago. [PATCH 1/3] egl: import egl.h from registry (v2) [PATCH 2/3] egl: import eglext.h from registry and cleanup eglmesaext.h (v2) [PATCH 3/3] egl: import platform headers from registry (v2) Marek On Wed, May 27, 2015 at 8:20 PM, Chad Versace wrote: > On Thu 14 May 2015, Emil Velikov wrote: >> Hi Marek, >> On 12/05/15 22:54, Marek Olšák wrote: >> > From: Marek Olšák >> > >> > with the extension of keeping: >> > #define KHRONOS_APICALL __attribute__((visibility("default"))) >> > >> > And don't include mesa headers in egl.h. >> Can we do this more gradually (like below). It will ease the conflicts >> that this patch might cause. >> >> - egl.h - should have no side effects >> - eglext.h and eglmesaext.h - will clearly illustrate the parts that >> have been moved from the latter to the former. >> - eglplatform.h and khrplatform.h - there are a few controversial >> changes which might cause breakage. > > I want to see this patch broken up too. Currently, the patch contains > a straightforward header update layered on top of Marek's header > modifications. The two should be split into separate patches so it's > clear which changes are which. > >> > --- >> > include/EGL/egl.h | 562 >> > + >> > include/EGL/eglext.h | 259 +++-- >> > include/EGL/eglplatform.h | 45 +--- >> > include/KHR/khrplatform.h | 25 +- >> > src/egl/main/egltypedefs.h | 2 + >> > 5 files changed, 540 insertions(+), 353 deletions(-) >> > >> >> > diff --git a/include/EGL/eglplatform.h b/include/EGL/eglplatform.h >> > index 2eb6865..1284089 100644 >> > --- a/include/EGL/eglplatform.h >> > +++ b/include/EGL/eglplatform.h >> >> > -#elif defined(__WINSCW__) || defined(__SYMBIAN32__) /* Symbian */ >> > +#elif defined(__APPLE__) || defined(__WINSCW__) || defined(__SYMBIAN32__) >> > /* Symbian */ >> > >> > typedef int EGLNativeDisplayType; >> > typedef void *EGLNativeWindowType; >> > typedef void *EGLNativePixmapType; >> > >> > -#elif defined(WL_EGL_PLATFORM) >> > +#elif defined(__ANDROID__) || defined(ANDROID) >> > >> > -typedef struct wl_display *EGLNativeDisplayType; >> > -typedef struct wl_egl_pixmap *EGLNativePixmapType; >> > -typedef struct wl_egl_window *EGLNativeWindowType; >> > +#include >> > >> > -#elif defined(__GBM__) >> > - >> > -typedef struct gbm_device *EGLNativeDisplayType; >> > -typedef struct gbm_bo *EGLNativePixmapType; >> > -typedef void *EGLNativeWindowType; >> > - >> > -#elif defined(ANDROID) /* Android */ >> > - >> > -struct ANativeWindow; >> > struct egl_native_pixmap_t; >> > >> > -typedef struct ANativeWindow*EGLNativeWindowType; >> > -typedef struct egl_native_pixmap_t *EGLNativePixmapType; >> > -typedef void*EGLNativeDisplayType; >> > +typedef struct ANativeWindow* EGLNativeWindowType; >> > +typedef struct egl_native_pixmap_t* EGLNativePixmapType; >> > +typedef void* EGLNativeDisplayType; >> > >> > #elif defined(__unix__) >> > >> > -#if defined(MESA_EGL_NO_X11_HEADERS) >> > - >> > -typedef void*EGLNativeDisplayType; >> > -typedef khronos_uintptr_t EGLNativePixmapType; >> > -typedef khronos_uintptr_t EGLNativeWindowType; >> > - >> > -#else >> > - >> > /* X11 (tentative) */ >> > #include >> > #include >> > @@ -122,18 +103,8 @@ typedef Display *EGLNativeDisplayType; >> > typedef Pixmap EGLNativePixmapType; >> > typedef Window EGLNativeWindowType; >> > >> > -#endif /* MESA_EGL_NO_X11_HEADERS */ >> > - >> > -#elif __HAIKU__ >> > -#include >> > -typedef void *EGLNativeDisplayType; >> > -typedef khronos_uintptr_t EGLNativePixmapType; >> > -typedef khronos_uintptr_t EGLNativeWindowType; >> > - >> Upon closer look, one could get away with the above changes, although >> there may be something more subtle to it. >> >> Kristian, Chad, >> >> Would you have any suggestions for/against nuking the Wayland/GBM/other >> typedefs ? With an extra eye on the Haiku changes, what would it take to >> get the current eglplatform.h (or equivalent) accepted with Khronos ? > > I'm against. Khronos intended eglplatform.h specifically for this > purpose. To quote the Khronos API Implementer's Guide: > > Implementers may need to modify eglplatform.h. In particular the > eglNativeDisplayType, eglNativeWindowType, and eglNativePixmapType > typedefs must be defined as appropriate for the platform (typically, > they will be typedef'ed to corresponding types in the native window > system). Developer documentation should mention the correspondence so > that developers know what parameters to pass to eglCreateWindowSurface, > eglCreatePixmapSurface, and eglCopyBuffers. Documentation should also > describe the format of the display_id parameter to eglGetDisplay, since > this is a pl
Re: [Mesa-dev] [PATCH 1/3] egl: import egl.h from registry (v2)
On Sat 23 May 2015, Marek Olšák wrote: > From: Marek Olšák > > v2: split the commit into 3 patches Thanks for splitting the patches up. For the series, Reviewed-by: Chad Versace ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/15] egl: import headers from Khronos EGL registry
On Thu 28 May 2015, Marek Olšák wrote: > Hi Chad, > > I broke up the patches and sent them as a separate patch series a few days > ago. > > [PATCH 1/3] egl: import egl.h from registry (v2) > [PATCH 2/3] egl: import eglext.h from registry and cleanup eglmesaext.h (v2) > [PATCH 3/3] egl: import platform headers from registry (v2) Thanks for splitting them up. I just replied to that series in a separate message. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 7/7] docs/GL3.txt: update softpipe arb_gpu_shader5 status.
On 27 May 2015 at 21:23, Ilia Mirkin wrote: > On Wed, May 27, 2015 at 3:46 AM, Dave Airlie wrote: >> Signed-off-by: Dave Airlie >> --- >> docs/GL3.txt | 6 +++--- >> 1 file changed, 3 insertions(+), 3 deletions(-) >> >> diff --git a/docs/GL3.txt b/docs/GL3.txt >> index 8e1c8cd..9fbde36 100644 >> --- a/docs/GL3.txt >> +++ b/docs/GL3.txt >> @@ -99,13 +99,13 @@ GL 4.0, GLSL 4.00: >>GL_ARB_gpu_shader5 DONE (i965, nvc0) >>- 'precise' qualifierDONE >>- Dynamically uniform sampler array indices DONE (r600) >> - - Dynamically uniform UBO array indices DONE (r600) >> + - Dynamically uniform UBO array indices DONE (r600, softpipe) > > I didn't see patches for this.. does it already Just Work? it seems to pass all the tests, Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] softpipe: add support for indexed queries.
Am 27.05.2015 um 09:45 schrieb Dave Airlie: > We need indexed queries to retrieve the geom shader info. > > Signed-off-by: Dave Airlie > --- > src/gallium/drivers/softpipe/sp_context.h | 2 +- > src/gallium/drivers/softpipe/sp_prim_vbuf.c | 4 ++-- > src/gallium/drivers/softpipe/sp_query.c | 23 --- > src/gallium/include/pipe/p_state.h | 2 +- > 4 files changed, 16 insertions(+), 15 deletions(-) > > diff --git a/src/gallium/drivers/softpipe/sp_context.h > b/src/gallium/drivers/softpipe/sp_context.h > index 50a7336..fa91854 100644 > --- a/src/gallium/drivers/softpipe/sp_context.h > +++ b/src/gallium/drivers/softpipe/sp_context.h > @@ -91,7 +91,7 @@ struct softpipe_context { > struct draw_so_target *so_targets[PIPE_MAX_SO_BUFFERS]; > unsigned num_so_targets; > > - struct pipe_query_data_so_statistics so_stats; > + struct pipe_query_data_so_statistics so_stats[PIPE_MAX_VERTEX_STREAMS]; > > struct pipe_query_data_pipeline_statistics pipeline_statistics; > unsigned active_statistics_queries; > diff --git a/src/gallium/drivers/softpipe/sp_prim_vbuf.c > b/src/gallium/drivers/softpipe/sp_prim_vbuf.c > index 5809fd5..6c16d9c 100644 > --- a/src/gallium/drivers/softpipe/sp_prim_vbuf.c > +++ b/src/gallium/drivers/softpipe/sp_prim_vbuf.c > @@ -602,8 +602,8 @@ sp_vbuf_so_info(struct vbuf_render *vbr, uint stream, > uint primitives, uint prim > struct softpipe_vbuf_render *cvbr = softpipe_vbuf_render(vbr); > struct softpipe_context *softpipe = cvbr->softpipe; > > - softpipe->so_stats.num_primitives_written += primitives; > - softpipe->so_stats.primitives_storage_needed += prim_generated; > + softpipe->so_stats[stream].num_primitives_written += primitives; > + softpipe->so_stats[stream].primitives_storage_needed += prim_generated; > } > > static void > diff --git a/src/gallium/drivers/softpipe/sp_query.c > b/src/gallium/drivers/softpipe/sp_query.c > index e773870..0707219 100644 > --- a/src/gallium/drivers/softpipe/sp_query.c > +++ b/src/gallium/drivers/softpipe/sp_query.c > @@ -39,6 +39,7 @@ > > struct softpipe_query { > unsigned type; > + unsigned index; > uint64_t start; > uint64_t end; > struct pipe_query_data_so_statistics so; > @@ -71,7 +72,7 @@ softpipe_create_query(struct pipe_context *pipe, >type == PIPE_QUERY_TIMESTAMP_DISJOINT); > sq = CALLOC_STRUCT( softpipe_query ); > sq->type = type; > - > + sq->index = index; > return (struct pipe_query *)sq; > } > > @@ -98,17 +99,17 @@ softpipe_begin_query(struct pipe_context *pipe, struct > pipe_query *q) >sq->start = os_time_get_nano(); >break; > case PIPE_QUERY_SO_STATISTICS: > - sq->so.num_primitives_written = > softpipe->so_stats.num_primitives_written; > - sq->so.primitives_storage_needed = > softpipe->so_stats.primitives_storage_needed; > + sq->so.num_primitives_written = > softpipe->so_stats[0].num_primitives_written; > + sq->so.primitives_storage_needed = > softpipe->so_stats[0].primitives_storage_needed; >break; > case PIPE_QUERY_SO_OVERFLOW_PREDICATE: >sq->end = FALSE; >break; > case PIPE_QUERY_PRIMITIVES_EMITTED: > - sq->so.num_primitives_written = > softpipe->so_stats.num_primitives_written; > + sq->so.num_primitives_written = > softpipe->so_stats[sq->index].num_primitives_written; >break; > case PIPE_QUERY_PRIMITIVES_GENERATED: > - sq->so.primitives_storage_needed = > softpipe->so_stats.primitives_storage_needed; > + sq->so.primitives_storage_needed = > softpipe->so_stats[sq->index].primitives_storage_needed; >break; > case PIPE_QUERY_TIMESTAMP: > case PIPE_QUERY_GPU_FINISHED: > @@ -154,24 +155,24 @@ softpipe_end_query(struct pipe_context *pipe, struct > pipe_query *q) >break; > case PIPE_QUERY_SO_OVERFLOW_PREDICATE: >sq->so.num_primitives_written = > - softpipe->so_stats.num_primitives_written - > sq->so.num_primitives_written; > + softpipe->so_stats[0].num_primitives_written - > sq->so.num_primitives_written; >sq->so.primitives_storage_needed = > - softpipe->so_stats.primitives_storage_needed - > sq->so.primitives_storage_needed; > + softpipe->so_stats[0].primitives_storage_needed - > sq->so.primitives_storage_needed; >sq->end = sq->so.primitives_storage_needed > > sq->so.num_primitives_written; >break; > case PIPE_QUERY_SO_STATISTICS: >sq->so.num_primitives_written = > - softpipe->so_stats.num_primitives_written - > sq->so.num_primitives_written; > + softpipe->so_stats[sq->index].num_primitives_written - > sq->so.num_primitives_written; >sq->so.primitives_storage_needed = > - softpipe->so_stats.primitives_storage_needed - > sq->so.primitives_storage_needed; > + softpipe->so_stats[sq->index].primitives_storage_needed - > sq->so.primitives_storage_needed
Re: [Mesa-dev] [PATCH 5/7] draw: add support to tgsi paths for geometry streams.
Am 27.05.2015 um 09:45 schrieb Dave Airlie: > This hooks up the geometry shader processing to the TGSI > support added in the previous commits. > > It doesn't change the llvm interface other than to > keep things building. > > Signed-off-by: Dave Airlie > --- > src/gallium/auxiliary/draw/draw_gs.c | 195 > + > src/gallium/auxiliary/draw/draw_gs.h | 21 ++- > src/gallium/auxiliary/draw/draw_pt.h | 1 + > .../auxiliary/draw/draw_pt_fetch_shade_pipeline.c | 16 +- > .../draw/draw_pt_fetch_shade_pipeline_llvm.c | 14 +- > src/gallium/auxiliary/draw/draw_pt_so_emit.c | 64 --- > 6 files changed, 192 insertions(+), 119 deletions(-) > > diff --git a/src/gallium/auxiliary/draw/draw_gs.c > b/src/gallium/auxiliary/draw/draw_gs.c > index 755e527..9798518 100644 > --- a/src/gallium/auxiliary/draw/draw_gs.c > +++ b/src/gallium/auxiliary/draw/draw_gs.c > @@ -75,6 +75,7 @@ draw_gs_should_flush(struct draw_geometry_shader *shader) > /*#define DEBUG_OUTPUTS 1*/ > static void > tgsi_fetch_gs_outputs(struct draw_geometry_shader *shader, > + unsigned stream, >unsigned num_primitives, >float (**p_output)[4]) > { > @@ -89,14 +90,16 @@ tgsi_fetch_gs_outputs(struct draw_geometry_shader *shader, > */ > > for (prim_idx = 0; prim_idx < num_primitives; ++prim_idx) { > - unsigned num_verts_per_prim = machine->Primitives[prim_idx]; > - shader->primitive_lengths[prim_idx + shader->emitted_primitives] = > - machine->Primitives[prim_idx]; > - shader->emitted_vertices += num_verts_per_prim; > + unsigned num_verts_per_prim = machine->Primitives[stream][prim_idx]; > + > + shader->stream[stream].primitive_lengths[prim_idx + > shader->stream[stream].emitted_primitives] = I'm not really trying to enforce strict 80 column lines, but over 100 is definitely too much. I know there's some other ridiculously long lines in some draw and tgsi code but please try to avoid new ones. > + machine->Primitives[stream][prim_idx]; > + shader->stream[stream].emitted_vertices += num_verts_per_prim; Did things actually still build/work around here before this patch (that is if you just applied patches up to 3/7)? Looks to me like it might not, if so it would be nice to get this fixed. >for (j = 0; j < num_verts_per_prim; j++, current_idx++) { > - int idx = current_idx * shader->info.num_outputs; > + int idx = machine->PrimitiveOffsets[stream][prim_idx] + current_idx > * shader->info.num_outputs; linelength > #ifdef DEBUG_OUTPUTS > - debug_printf("%d) Output vert:\n", idx / shader->info.num_outputs); > + debug_printf("%d/%d) Output vert:\n", stream, idx / > shader->info.num_outputs); > #endif > for (slot = 0; slot < shader->info.num_outputs; slot++) { > output[slot][0] = machine->Outputs[idx + slot].xyzw[0].f[0]; > @@ -115,7 +118,7 @@ tgsi_fetch_gs_outputs(struct draw_geometry_shader *shader, >} > } > *p_output = output; > - shader->emitted_primitives += num_primitives; > + shader->stream[stream].emitted_primitives += num_primitives; > } > > /*#define DEBUG_INPUTS 1*/ > @@ -201,11 +204,12 @@ static void tgsi_gs_prepare(struct draw_geometry_shader > *shader, > } > } > > -static unsigned tgsi_gs_run(struct draw_geometry_shader *shader, > -unsigned input_primitives) > +static void tgsi_gs_run(struct draw_geometry_shader *shader, > +unsigned input_primitives, > +unsigned *out_prims) whitespace? > { > struct tgsi_exec_machine *machine = shader->machine; > - > + int i; > tgsi_set_exec_mask(machine, >1, >input_primitives > 1, > @@ -215,8 +219,30 @@ static unsigned tgsi_gs_run(struct draw_geometry_shader > *shader, > /* run interpreter */ > tgsi_exec_machine_run(machine); > > - return > - > machine->Temps[TGSI_EXEC_TEMP_PRIMITIVE_I].xyzw[TGSI_EXEC_TEMP_PRIMITIVE_C].u[0]; > + for (i = 0; i < 4; i++) { > + int prim_i; > + int prim_c; > + switch (i) { > + case 0: > + prim_i = TGSI_EXEC_TEMP_PRIMITIVE_I; > + prim_c = TGSI_EXEC_TEMP_PRIMITIVE_C; > + break; > + case 1: > + prim_i = TGSI_EXEC_TEMP_PRIMITIVE_S1_I; > + prim_c = TGSI_EXEC_TEMP_PRIMITIVE_S1_C; > + break; > + case 2: > + prim_i = TGSI_EXEC_TEMP_PRIMITIVE_S2_I; > + prim_c = TGSI_EXEC_TEMP_PRIMITIVE_S2_C; > + break; > + case 3: > + prim_i = TGSI_EXEC_TEMP_PRIMITIVE_S3_I; > + prim_c = TGSI_EXEC_TEMP_PRIMITIVE_S3_C; > + break; > + }; > + > + out_prims[i] = machine->Temps[prim_i].xyzw[prim_c].u[0]; > + } > } > > #ifdef HAVE_LLVM > @@ -293,6 +319,7 @@ llvm_fetch_gs_input(struct draw_geometry_shader
[Mesa-dev] [PATCH] i965: Disable compaction for EOT send messages
AFAICT, there is no real way to make sure a send message with EOT is properly ignored from compact, nor can I see a way to actually encode EOT while compacting. Before the single send optimization we'd always bail because we hit the is_immediate && !is_compactable_immediate case. However, with single send, is_immediate is not true, and so we end up trying to compact the un-compactible. Without this, any compacting single send instruction will hang because the EOT isn't there. I am not sure how I didn't hit this when I originally enabled the optimization. I didn't check if some surrounding code changed. NOTE: This needs another piglit run or two before merge. I know Neil and Matt were both looking into this. I did a quick search and didn't see any patches out there to handle this. Please ignore if this has already been sent by someone. (Direct me to it and I will review it). Cc: Matt Turner Cc: Neil Roberts Cc: Mark Janes Signed-off-by: Ben Widawsky --- src/mesa/drivers/dri/i965/brw_eu_compact.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_eu_compact.c b/src/mesa/drivers/dri/i965/brw_eu_compact.c index 69cb114..67f0b45 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_compact.c +++ b/src/mesa/drivers/dri/i965/brw_eu_compact.c @@ -849,6 +849,12 @@ set_3src_source_index(const struct brw_device_info *devinfo, static bool has_unmapped_bits(const struct brw_device_info *devinfo, brw_inst *src) { + /* EOT can only be mapped on a send if the src1 is an immediate */ + if ((brw_inst_opcode(devinfo, src) == BRW_OPCODE_SENDC || +brw_inst_opcode(devinfo, src) == BRW_OPCODE_SEND) && + brw_inst_eot(devinfo, src)) + return true; + /* Check for instruction bits that don't map to any of the fields of the * compacted instruction. The instruction cannot be compacted if any of * them are set. They overlap with: -- 2.4.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Disable compaction for EOT send messages
I forgot to mention that the predecessor to this patch fixed around 70 BSW failures on Jenkins. Note that our current Jenkins baseline for BSW is probably not fully baked atm. On Wed, May 27, 2015 at 10:16:04PM -0700, Ben Widawsky wrote: > AFAICT, there is no real way to make sure a send message with EOT is properly > ignored from compact, nor can I see a way to actually encode EOT while > compacting. Before the single send optimization we'd always bail because we > hit > the is_immediate && !is_compactable_immediate case. However, with single send, > is_immediate is not true, and so we end up trying to compact the > un-compactible. > > Without this, any compacting single send instruction will hang because the EOT > isn't there. I am not sure how I didn't hit this when I originally enabled the > optimization. I didn't check if some surrounding code changed. > > NOTE: This needs another piglit run or two before merge. > > I know Neil and Matt were both looking into this. I did a quick search and > didn't see any patches out there to handle this. Please ignore if this has > already been sent by someone. (Direct me to it and I will review it). > > Cc: Matt Turner > Cc: Neil Roberts > Cc: Mark Janes > Signed-off-by: Ben Widawsky > --- > src/mesa/drivers/dri/i965/brw_eu_compact.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_compact.c > b/src/mesa/drivers/dri/i965/brw_eu_compact.c > index 69cb114..67f0b45 100644 > --- a/src/mesa/drivers/dri/i965/brw_eu_compact.c > +++ b/src/mesa/drivers/dri/i965/brw_eu_compact.c > @@ -849,6 +849,12 @@ set_3src_source_index(const struct brw_device_info > *devinfo, > static bool > has_unmapped_bits(const struct brw_device_info *devinfo, brw_inst *src) > { > + /* EOT can only be mapped on a send if the src1 is an immediate */ > + if ((brw_inst_opcode(devinfo, src) == BRW_OPCODE_SENDC || > +brw_inst_opcode(devinfo, src) == BRW_OPCODE_SEND) && > + brw_inst_eot(devinfo, src)) > + return true; > + > /* Check for instruction bits that don't map to any of the fields of the > * compacted instruction. The instruction cannot be compacted if any of > * them are set. They overlap with: > -- > 2.4.1 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] glsl: validate sampler array indexing for 'constant-index-expression'
On 05/27/2015 02:30 AM, Francisco Jerez wrote: Ian Romanick writes: On 05/26/2015 02:04 PM, Francisco Jerez wrote: Ian Romanick writes: On 05/26/2015 02:53 AM, Tapani Pälli wrote: Hello; I'd like to ping if this approach would be ok. We've had some discussions with Curro about it and overall it would seem nicer to move this check to happen at compile time. However, this seems quite a problematic move. I'll try explain below why; The overall problem with the failing use cases in the bug is that loop unroll does not happen. It does not happen because loop analysis does not know how to deal with functions and there is a texture2D call inside the loop. Now what follows is that because unroll does not happen the array index (loop induction variable) does not become constant during compilation, this will happen only after linking (where unroll finally happens due to function inlining which allows further optimization). I have a hacky patch where I force unroll to happen early when only builtin calls are found inside loop and it works *but* unfortunately it does not help since in the unrolled result we still have sampler array indexing with a non-constant variable 'i', it will be constant only later after linking phase when function inlining and further optimizations happen. It looks like this (I modified ir print output a bit to fit in email): 1st round: assign var_ref i constant_int 0 call texture2D (constant int 0) 2nd round: assign var_ref i (var_ref i + constant_int 1) call texture2D (var_ref i) So at this point I got a bit tired of this approach. IMO linker check is sufficient and according the spec. Spec does not explicitly specify a compiler or linker error for this case but it does say: I agree. We could handle GLSL ES at compile time because there is only one compilation unit per stage, but I'm not convinced handling it special is worth any effort. I agree with both of you that it's not too important whether this validation happens at compile time or link time, what I find worrying is that we currently have no guarantee that sampler indexing expression of the form given by the spec (a "constant-index-expression") will actually be lowered into a constant by link time, so the check introduced in this patch may give a false positive in cases where the array index has the allowed form, like: | sampler2D tex[N]; | | for (i = 0; i < M; i++) { | vec4 x = texture(tex[some_complex_constant_expression_of(i)], ...); | // Very many instructions here, so the loop unrolling pass won't | // have the temptation of unrolling the loop even after linking. | } Admittedly without this check the situation was even worse because the indexing expression would most likely not have been in the form expected by the back-end, so it could have crashed or misrendered at a later point, even though this is a required feature of GLSL ES <3.00. IMHO the loop unrolling pass needs to be fixed to consider sampler indexing with a constant-index-expression (or some easier superset of that, like arbitrary non-constant expressions) as a kind of unsupported array indexing like it already does for other cases, otherwise the valid programs may fail to compile or not depending on the outcome of the loop unrolling heuristic. I think "need" is perhaps too strong. The point that you've hit on here is, in fact, the reason for the change between GLSL ES 1.00 and GLSL ES 3.00. :) We haven't yet encountered a valid application that has a shader that won't compile. At this point I don't think it's likely that we ever will. - Hardware increasingly "just works," so the restriction is unnecessary. - The "hard" restriction in GLSL ES 3.00. - Developer "tribal knowledge" that this is dangerous. The use of NIR is gradually moving up in the linker pipeline. Even if this were moderately important, I don't think it's worth investing effort in the existing loop infrastructure. That said, we should keep this firmly in mind as the NIR loop-handling infrastructure matures. At that point we can probably also revert this change. I was thinking we could just change the loop unrolling pass to force unroll loops in which a sampler array expression occurs with non-constant index (which are otherwise disallowed by the spec) if we are on GLSL ES 1.00. That could lead to slightly more aggressive unrolling than necessary in some cases, but it would make sure that all cases mentioned in the spec are covered, and it should be easy to do. OK, I'll try to make a test case to hit this first. I've also noticed that there's a WebGL 2.0 conformance test that assumes similar loop induction variable indexing even though this was removed in GLSL ES 3.0, I've filed a bug to Khronos to discuss if the test could be removed or the test should be modified to use GLSL ES 1.0. The linker check even here can be somewhat problematic. Back-ends do additional optimization, so they may be able to make some of these accesses be non-dynamic.
Re: [Mesa-dev] [PATCH] i965: Silence warning in 3-src type-setting.
On Wed, May 27, 2015 at 12:24:47PM -0700, Matt Turner wrote: > --- > src/mesa/drivers/dri/i965/brw_eu_emit.c | 2 ++ > 1 file changed, 2 insertions(+) Reviewed-by: Topi Pohjolainen > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c > b/src/mesa/drivers/dri/i965/brw_eu_emit.c > index e78d0be..a1d11f3 100644 > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > @@ -914,6 +914,8 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct > brw_reg dest, > brw_inst_set_3src_src_type(devinfo, inst, BRW_3SRC_TYPE_UD); > brw_inst_set_3src_dst_type(devinfo, inst, BRW_3SRC_TYPE_UD); > break; > + default: > + unreachable("not reached"); >} > } > > -- > 2.3.6 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev