Re: [Mesa-dev] [PATCH 1/2] mesa: Detect and provide macros for function attributes pure and const.
On 18/07/15 01:38, Eric Anholt wrote:
> Emil Velikov writes:
>> On 14/07/15 19:45, Eric Anholt wrote:
>>> These are really useful hints to the compiler in the absence of link-time optimization, and I'm going to use them in VC4. I've made the const attribute be ATTRIBUTE_CONST unlike other function attributes, because we have other things in the tree #defining CONST for their own unrelated purposes.
>> Mildly related: how do people feel about making these macros less screamy, by following the approach used in the kernel: PURE -> __pure and so on?
> I'd love it.

Less screamy is fine, but beware of prefixing with a double underscore: the C standard stipulates that its use is reserved for the C/C++ runtime. [1] Look at the libstdc++ implementation: every internal variable has a double underscore prefix. Maybe the kernel gets away with it on GLIBC (and because it doesn't use C++), but there's no guarantee it will work on other C runtimes, and even if it does, it could start failing at any time.

Jose

[1] http://stackoverflow.com/a/224420
___
mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
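For readers unfamiliar with the two attributes under discussion, here is a minimal sketch of what such macros might look like. The macro names PURE and ATTRIBUTE_CONST come from the thread; the plain compiler check is an assumption standing in for the detection done by the patch (which is not shown here), and the helper functions square() and sum_bytes() are purely illustrative.

```c
#include <stddef.h>

/* Assumption: a simple compiler check stands in for the build-time
 * detection done by the patch, which is not shown in this thread. */
#if defined(__GNUC__) || defined(__clang__)
#define PURE            __attribute__((__pure__))
#define ATTRIBUTE_CONST __attribute__((__const__))
#else
#define PURE
#define ATTRIBUTE_CONST
#endif

/* const: the result depends only on the argument values and the
 * function reads no memory, so repeated calls can be folded. */
static ATTRIBUTE_CONST int square(int x)
{
   return x * x;
}

/* pure: no side effects, but the function reads memory through its
 * pointer argument, so it must not be marked const. */
static PURE unsigned sum_bytes(const unsigned char *data, size_t len)
{
   unsigned total = 0;

   for (size_t i = 0; i < len; i++)
      total += data[i];
   return total;
}
```

Without link-time optimization the compiler cannot see into functions defined in other translation units, which is why annotating the declarations can help it remove redundant calls.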
[Mesa-dev] [Bug 91254] (regresion) video using VA-API on Intel slow and freeze system with mesa 10.6 or 10.6.1
https://bugs.freedesktop.org/show_bug.cgi?id=91254

me...@frugalware.org changed:

What |Removed |Added
CC   |        |me...@frugalware.org

--- Comment #6 from me...@frugalware.org ---
*** Bug 91343 has been marked as a duplicate of this bug. ***
Re: [Mesa-dev] [Mesa-stable] [PATCH 16/18] i965: Prevent coordinate overflow in intel_emit_linear_blit
On Fri, Jul 17, 2015 at 05:12:54PM -0700, Anuj Phogat wrote:
> On Mon, Jul 6, 2015 at 3:33 AM, Chris Wilson wrote:
> > + do {
> > +    /* The pitch given to the GPU must be DWORD aligned, and
> > +     * we want width to match pitch. Max width is (1 << 15 - 1),
> > +     * rounding that down to the nearest DWORD is 1 << 15 - 4
> > +     */
> > +    pitch = ROUND_DOWN_TO(MIN2(size, (1 << 15) - 64), 4);
> I understand why you are subtracting 64 in above statement, it'll
> be nice to update above comment explaining the reason.

We use the pitch to set the copy width, so the maximum x coordinate becomes src_x + pitch. Since src_x has a maximum value of 63, we want to make sure that pitch is no greater than 32767 - 63. Simplified above.

> > +    height = (size < pitch || pitch == 0) ? 1 : size / pitch;
...
> > +    pitch *= height;
> > +    if (size <= pitch)
> I think size < pitch will never be true. How about:
> assert(size < pitch);

For a single row copy, size can be less than pitch.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
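The arithmetic behind the "subtract 64" step can be checked in isolation. The sketch below is not the driver code: MIN2 and ROUND_DOWN_TO are local stand-ins written to behave like the macros named in the quoted hunk, and blit_pitch() and blit_height() are hypothetical helper names introduced only for illustration.

```c
/* Local stand-ins for the macros used in the quoted hunk. The alignment
 * passed to ROUND_DOWN_TO is assumed to be a power of two. */
#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define ROUND_DOWN_TO(value, alignment) ((value) & ~((alignment) - 1))

/* Largest DWORD-aligned pitch such that src_x + pitch still fits in
 * 15 bits when src_x can be as large as 63. */
static unsigned blit_pitch(unsigned size)
{
   return ROUND_DOWN_TO(MIN2(size, (1u << 15) - 64), 4u);
}

/* Number of rows copied in one pass, following the quoted expression. */
static unsigned blit_height(unsigned size, unsigned pitch)
{
   return (size < pitch || pitch == 0) ? 1 : size / pitch;
}
```

For a large copy the pitch saturates at (1 << 15) - 64 = 32704, so the largest x coordinate is 63 + 32704 = 32767, which is the 15-bit limit mentioned in the reply.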
[Mesa-dev] [PATCH excerpt] mesa: Rename _mesa_lookup_enum_by_nr() to _mesa_enum_to_string().
Generated by sed; no manual changes. Signed-off-by: Kenneth Graunke --- We talked about doing this back in 2013, but the patches never quite materialized. Here's the obvious sed job. Actual patch is here: http://cgit.freedesktop.org/~kwg/mesa/commit/?h=enumtostring I figured I'd spare the mailing list 146Kb of obvious diff. Thoughts? src/mapi/glapi/gen/gl_enums.py | 2 +- src/mesa/drivers/common/meta_blit.c | 2 +- src/mesa/drivers/common/meta_generate_mipmap.c | 2 +- src/mesa/drivers/dri/i915/i830_state.c | 20 ++-- src/mesa/drivers/dri/i915/intel_fbo.c| 2 +- src/mesa/drivers/dri/i915/intel_mipmap_tree.c| 2 +- src/mesa/drivers/dri/i915/intel_render.c | 2 +- src/mesa/drivers/dri/i915/intel_tex_image.c | 2 +- src/mesa/drivers/dri/i915/intel_tex_subimage.c | 2 +- src/mesa/drivers/dri/i915/intel_tris.c | 4 +- src/mesa/drivers/dri/i965/brw_draw.c | 8 +- src/mesa/drivers/dri/i965/brw_draw_upload.c | 2 +- src/mesa/drivers/dri/i965/gen6_cc.c | 4 +- src/mesa/drivers/dri/i965/intel_fbo.c| 2 +- src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 2 +- src/mesa/drivers/dri/i965/intel_tex_image.c | 4 +- src/mesa/drivers/dri/i965/intel_tex_subimage.c | 4 +- src/mesa/drivers/dri/r200/r200_state.c | 2 +- src/mesa/drivers/dri/r200/r200_tex.c | 18 ++-- src/mesa/drivers/dri/radeon/radeon_common.c | 2 +- src/mesa/drivers/dri/radeon/radeon_fbo.c | 2 +- src/mesa/drivers/dri/radeon/radeon_mipmap_tree.c | 2 +- src/mesa/drivers/dri/radeon/radeon_pixel_read.c | 2 +- src/mesa/drivers/dri/radeon/radeon_state.c | 2 +- src/mesa/drivers/dri/radeon/radeon_swtcl.c | 2 +- src/mesa/drivers/dri/radeon/radeon_tex.c | 6 +- src/mesa/drivers/dri/radeon/radeon_texture.c | 4 +- src/mesa/main/api_validate.c | 2 +- src/mesa/main/atifragshader.c| 30 +++--- src/mesa/main/attrib.c | 2 +- src/mesa/main/blend.c| 34 +++ src/mesa/main/blit.c | 8 +- src/mesa/main/bufferobj.c| 14 +-- src/mesa/main/buffers.c | 22 ++--- src/mesa/main/clear.c| 8 +- src/mesa/main/condrender.c | 4 +- src/mesa/main/copyimage.c| 8 +- 
src/mesa/main/debug.c| 2 +- src/mesa/main/depth.c| 2 +- src/mesa/main/dlist.c| 14 +-- src/mesa/main/drawpix.c | 18 ++-- src/mesa/main/enable.c | 12 +-- src/mesa/main/enums.h| 2 +- src/mesa/main/errors.c | 4 +- src/mesa/main/fbobject.c | 72 +++--- src/mesa/main/feedback.c | 2 +- src/mesa/main/formatquery.c | 8 +- src/mesa/main/framebuffer.c | 2 +- src/mesa/main/genmipmap.c| 2 +- src/mesa/main/get.c | 8 +- src/mesa/main/getstring.c| 4 +- src/mesa/main/glformats.c| 6 +- src/mesa/main/hint.c | 4 +- src/mesa/main/light.c| 6 +- src/mesa/main/matrix.c | 8 +- src/mesa/main/objectlabel.c | 2 +- src/mesa/main/pipelineobj.c | 2 +- src/mesa/main/polygon.c | 8 +- src/mesa/main/program_resource.c | 20 ++-- src/mesa/main/queryobj.c | 28 +++--- src/mesa/main/readpix.c | 12 +-- src/mesa/main/samplerobj.c | 20 ++-- src/mesa/main/shader_query.cpp | 14 +-- src/mesa/main/shaderapi.c| 8 +- src/mesa/main/shaderimage.c | 2 +- src/mesa/main/tests/enum_strings.cpp | 4 +- src/mesa/main/texenv.c | 10 +- src/mesa/main/texformat.c| 2 +- src/mesa/main/texgen.c | 6 +- src/mesa/main/texgetimage.c | 4 +- src/mesa/main/teximage.c | 114 +++ src/mesa/main/texobj.c | 6 +- src/mesa/main/texparam.c | 24 ++--- src/mesa/main/texstate.c | 36 +++ src/mesa/main/texstate.h | 2 +- src/mesa/main/texstorage.c | 16 ++-- src/mesa/main/textureview.c | 10 +- src/mesa/main/uniforms.c
Re: [Mesa-dev] [PATCH] radeonsi: don't return NULL fence if no fence is available
Michel Dänzer writes:
> On 17.07.2015 06:03, Marek Olšák wrote:
>> From: Marek Olšák
>>
>> An alternative (and ugly) solution to the current clover issue.
>
> How about something like this instead? (Compile tested only)
>

I'm rather unfamiliar with the radeonsi pipe driver code so I should probably hold myself back from giving you an R-b, but I must say that this seems much cleaner than the last two solutions proposed so far... :)

>
> diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c b/src/gallium/drivers/radeonsi/si_hw_context.c
> index 08cc08e..dc8702e 100644
> --- a/src/gallium/drivers/radeonsi/si_hw_context.c
> +++ b/src/gallium/drivers/radeonsi/si_hw_context.c
> @@ -84,7 +84,8 @@ void si_context_gfx_flush(void *context, unsigned flags,
>          struct radeon_winsys_cs *cs = ctx->b.rings.gfx.cs;
>          struct radeon_winsys *ws = ctx->b.ws;
>
> -        if (cs->cdw == ctx->b.initial_gfx_cs_size) {
> +        if (cs->cdw == ctx->b.initial_gfx_cs_size &&
> +            (!fence || ctx->last_gfx_fence)) {
>                  if (fence)
>                          ws->fence_reference(fence, ctx->last_gfx_fence);
>                  if (!(flags & RADEON_FLUSH_ASYNC))
>
> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Mesa and X developer
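As a reading aid, the condition in the proposed diff boils down to: take the early no-op path only when nothing new has been emitted and either no fence was requested or a previous fence exists to hand back. A minimal sketch of that predicate follows; the function and parameter names are hypothetical flattenings of the driver fields, not radeonsi API.

```c
#include <stdbool.h>

/* Hypothetical flattening of the condition in the quoted diff:
 *   cdw, initial_size  stand in for cs->cdw and ctx->b.initial_gfx_cs_size,
 *   fence_requested    stands in for (fence != NULL),
 *   have_last_fence    stands in for (ctx->last_gfx_fence != NULL). */
static bool can_skip_gfx_flush(unsigned cdw, unsigned initial_size,
                               bool fence_requested, bool have_last_fence)
{
   return cdw == initial_size && (!fence_requested || have_last_fence);
}
```

The case that changes relative to the old code is an empty command stream with a fence requested but no previous fence: it no longer takes the early path, so the caller does not get a NULL fence back.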
[Mesa-dev] [PATCH 02/12] i965/fs: Use exec_size instead of dispatch_width to determine the message variant.
dispatch_width is global for a single compilation and doesn't necessarily match the desired execution width if we had to lower the original full-width instruction due to hardware limitations. These were all inside a Gen4-specific branch so this patch shouldn't have any effect on more recent hardware. --- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index a176fcf..811fb73 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -655,7 +655,7 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src /* Note that G45 and older determines shadow compare and dispatch width * from message length for most messages. */ - if (dispatch_width == 8) { + if (inst->exec_size == 8) { msg_type = BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE; if (inst->shadow_compare) { assert(inst->mlen == 6); @@ -674,7 +674,7 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src break; case FS_OPCODE_TXB: if (inst->shadow_compare) { -assert(dispatch_width == 8); +assert(inst->exec_size == 8); assert(inst->mlen == 6); msg_type = BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE_BIAS_COMPARE; } else { @@ -685,7 +685,7 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src break; case SHADER_OPCODE_TXL: if (inst->shadow_compare) { -assert(dispatch_width == 8); +assert(inst->exec_size == 8); assert(inst->mlen == 6); msg_type = BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE_LOD_COMPARE; } else { @@ -696,7 +696,7 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src break; case SHADER_OPCODE_TXD: /* There is no sample_d_c message; comparisons are done manually */ - assert(dispatch_width == 8); + assert(inst->exec_size == 8); assert(inst->mlen == 7 || inst->mlen == 10); msg_type = 
BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE_GRADIENTS; break; -- 2.4.3
[Mesa-dev] [PATCH 03/12] i965/fs: Fix opt_zero_samples() for texturing ops not matching dispatch_width.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 6afb9fe..c31a0e1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2145,11 +2145,11 @@ fs_visitor::opt_zero_samples()
        * "Parameter 0 is required except for the sampleinfo message, which
        * has no parameter 0"
        */
-      while (inst->mlen > inst->header_size + dispatch_width / 8 &&
+      while (inst->mlen > inst->header_size + inst->exec_size / 8 &&
              load_payload->src[(inst->mlen - inst->header_size) /
-                               (dispatch_width / 8) +
+                               (inst->exec_size / 8) +
                                inst->header_size - 1].is_zero()) {
-         inst->mlen -= dispatch_width / 8;
+         inst->mlen -= inst->exec_size / 8;
          progress = true;
       }
    }
--
2.4.3
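To see why the per-instruction width matters in this loop, here is a simplified model of the trimming logic. It is a sketch under stated assumptions: struct sketch_msg and trim_zero_params() are hypothetical names, the payload is reduced to an array of "is this source zero" flags, and each parameter is assumed to occupy exec_size / 8 registers as in the patch.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a sampler message: mlen counts
 * registers, each parameter occupies exec_size / 8 registers, and
 * is_zero[i] says whether payload source i is a constant zero. */
struct sketch_msg {
   unsigned mlen;
   unsigned header_size;
   unsigned exec_size;
   const bool *is_zero;
};

/* Drop trailing all-zero parameters, always keeping parameter 0.
 * Mirrors the shape of the loop in the patched opt_zero_samples(). */
static bool trim_zero_params(struct sketch_msg *msg)
{
   const unsigned regs_per_param = msg->exec_size / 8;
   bool progress = false;

   while (msg->mlen > msg->header_size + regs_per_param &&
          msg->is_zero[(msg->mlen - msg->header_size) / regs_per_param +
                       msg->header_size - 1]) {
      msg->mlen -= regs_per_param;
      progress = true;
   }
   return progress;
}
```

With four headerless SIMD8 parameters of which the last two are zero, the loop shortens the message from 4 registers to 2. Feeding the same message through with a width of 16 (the analogue of using dispatch_width for a lowered SIMD8 instruction) indexes the wrong source and trims nothing.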
[Mesa-dev] [PATCH 07/12] i965/fs: Implement lowering of logical texturing opcodes on Gen5-6.
This should be largely equivalent to emit_texture_gen5() except for slight code style changes and the use of i965 opcodes instead of the ir_texture_opcode enum; see "i965/fs: Implement lowering of logical texturing opcodes on Gen7+." for the mapping between them. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 103 +++ 1 file changed, 103 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 7387ca5..5233ac3 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3369,6 +3369,104 @@ lower_fb_write_logical_send(const fs_builder &bld, fs_inst *inst, inst->header_size = header_size; } +static void +lower_sampler_logical_send_gen5(const fs_builder &bld, fs_inst *inst, opcode op, +fs_reg coordinate, +const fs_reg &shadow_c, +fs_reg lod, fs_reg lod2, +const fs_reg &sample_index, +const fs_reg &sampler, +const fs_reg &offset_value, +unsigned coord_components, +unsigned grad_components) +{ + fs_reg message(MRF, 2, BRW_REGISTER_TYPE_F); + fs_reg msg_coords = message; + unsigned header_size = 0; + + if (offset_value.file != BAD_FILE) { + /* The offsets set up by the visitor are in the m1 header, so we can't + * go headerless.
+ */ + header_size = 1; + message.reg--; + } + + for (unsigned i = 0; i < coord_components; i++) { + bld.MOV(retype(offset(msg_coords, bld, i), coordinate.type), coordinate); + coordinate = offset(coordinate, bld, 1); + } + fs_reg msg_end = offset(msg_coords, bld, coord_components); + fs_reg msg_lod = offset(msg_coords, bld, 4); + + if (shadow_c.file != BAD_FILE) { + fs_reg msg_shadow = msg_lod; + bld.MOV(msg_shadow, shadow_c); + msg_lod = offset(msg_shadow, bld, 1); + msg_end = msg_lod; + } + + switch (op) { + case SHADER_OPCODE_TXL: + case FS_OPCODE_TXB: + bld.MOV(msg_lod, lod); + msg_end = offset(msg_lod, bld, 1); + break; + case SHADER_OPCODE_TXD: + /** + * P = u,v,r + * dPdx = dudx, dvdx, drdx + * dPdy = dudy, dvdy, drdy + * + * Load up these values: + * - dudx dudy dvdx dvdy drdx drdy + * - dPdx.x dPdy.x dPdx.y dPdy.y dPdx.z dPdy.z + */ + msg_end = msg_lod; + for (unsigned i = 0; i < grad_components; i++) { + bld.MOV(msg_end, lod); + lod = offset(lod, bld, 1); + msg_end = offset(msg_end, bld, 1); + + bld.MOV(msg_end, lod2); + lod2 = offset(lod2, bld, 1); + msg_end = offset(msg_end, bld, 1); + } + break; + case SHADER_OPCODE_TXS: + msg_lod = retype(msg_end, BRW_REGISTER_TYPE_UD); + bld.MOV(msg_lod, lod); + msg_end = offset(msg_lod, bld, 1); + break; + case SHADER_OPCODE_TXF: + msg_lod = offset(msg_coords, bld, 3); + bld.MOV(retype(msg_lod, BRW_REGISTER_TYPE_UD), lod); + msg_end = offset(msg_lod, bld, 1); + break; + case SHADER_OPCODE_TXF_CMS: + msg_lod = offset(msg_coords, bld, 3); + /* lod */ + bld.MOV(retype(msg_lod, BRW_REGISTER_TYPE_UD), fs_reg(0u)); + /* sample index */ + bld.MOV(retype(offset(msg_lod, bld, 1), BRW_REGISTER_TYPE_UD), sample_index); + msg_end = offset(msg_lod, bld, 2); + break; + default: + break; + } + + inst->opcode = op; + inst->src[0] = reg_undef; + inst->src[1] = sampler; + inst->resize_sources(2); + inst->base_mrf = message.reg; + inst->mlen = msg_end.reg - message.reg; + inst->header_size = header_size; + + /* Message length > 
MAX_SAMPLER_MESSAGE_SIZE disallowed by hardware. */ + assert(inst->mlen <= MAX_SAMPLER_MESSAGE_SIZE); +} + static bool is_high_sampler(const struct brw_device_info *devinfo, const fs_reg &sampler) { @@ -3604,6 +3702,11 @@ lower_sampler_logical_send(const fs_builder &bld, fs_inst *inst, opcode op) shadow_c, lod, lod2, sample_index, mcs, sampler, offset_value, coord_components, grad_components); + } else if (devinfo->gen >= 5) { + lower_sampler_logical_send_gen5(bld, inst, op, coordinate, + shadow_c, lod, lod2, sample_index, + sampler, offset_value, + coord_components, grad_components); } else { assert(!"Not implemented"); } -- 2.4.3
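The TXD case in this patch interleaves the two gradient vectors component by component (dudx dudy dvdx dvdy drdx drdy). The sketch below models only that ordering on plain float arrays; pack_gradients_gen5() is a hypothetical name, and the real code moves registers through the builder rather than copying floats.

```c
/* Interleave dPdx and dPdy per component, in the order described by the
 * TXD comment in the patch. Returns the number of slots written.
 * 'out' must have room for 2 * grad_components entries. */
static unsigned pack_gradients_gen5(const float *dpdx, const float *dpdy,
                                    unsigned grad_components, float *out)
{
   unsigned n = 0;

   for (unsigned i = 0; i < grad_components; i++) {
      out[n++] = dpdx[i];   /* corresponds to the MOV from lod  */
      out[n++] = dpdy[i];   /* corresponds to the MOV from lod2 */
   }
   return n;
}
```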
[Mesa-dev] [PATCH 09/12] i965/fs: Hook up SIMD lowering to handle texturing opcodes of unsupported width.
This should match the set of cases in which we currently call fail() or no16() from the emit_texture_*() methods and the ones in which emit_texture_gen4() enables the SIMD16 workaround. Hint for reviewers: It's not a big deal if I happen to have missed some case here, it will just lead to an assertion failure down the road which is easily fixable, however being stricter than necessary won't cause any visible breakage, it would just decrease performance silently due to the unnecessary message splitting, so feel free to double-check that all cases listed here already cause a SIMD8/16 fall-back with the current texturing code -- You may want to skip over the Gen5-6 cases though if you don't have pencil and paper at hand. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 27 +++ 1 file changed, 27 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 043d9e9..f291202 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3918,6 +3918,33 @@ get_lowered_simd_width(const struct brw_device_info *devinfo, /* Dual-source FB writes are unsupported in SIMD16 mode. */ return (inst->src[1].file != BAD_FILE ? 8 : inst->exec_size); + case SHADER_OPCODE_TXD_LOGICAL: + /* TXD is unsupported in SIMD16 mode. */ + return 8; + + case SHADER_OPCODE_TG4_OFFSET_LOGICAL: { + /* gather4_po_c is unsupported in SIMD16 mode. */ + const fs_reg &shadow_c = inst->src[1]; + return (shadow_c.file != BAD_FILE ? 8 : inst->exec_size); + } + case SHADER_OPCODE_TXL_LOGICAL: + case FS_OPCODE_TXB_LOGICAL: { + /* Gen4 doesn't have SIMD8 non-shadow-compare bias/LOD instructions, and + * Gen4-6 don't support TXL and TXB with shadow comparison in SIMD16 + * mode. + */ + const fs_reg &shadow_c = inst->src[1]; + return (devinfo->gen == 4 && shadow_c.file == BAD_FILE ? 16 : + devinfo->gen < 7 && shadow_c.file != BAD_FILE ? 
8 : + inst->exec_size); + } + case SHADER_OPCODE_TXF_LOGICAL: + case SHADER_OPCODE_TXS_LOGICAL: + /* Gen4 doesn't have SIMD8 variants for the RESINFO and LD-with-LOD + * messages. Use SIMD16 instead. + */ + return (devinfo->gen == 4 ? 16 : inst->exec_size); + default: return inst->exec_size; } -- 2.4.3
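For reviewers who want to check the cases without reading the flattened diff, the width selection added by this patch can be restated as a small standalone function. This is a sketch, not the driver code: the enum sketch_op and lowered_simd_width() are hypothetical names, gen stands in for devinfo->gen, and has_shadow_c stands in for shadow_c.file != BAD_FILE.

```c
#include <stdbool.h>

/* Hypothetical reduced opcode set covering the cases in the patch. */
enum sketch_op {
   OP_TXD, OP_TG4_OFFSET, OP_TXL, OP_TXB, OP_TXF, OP_TXS, OP_OTHER
};

static unsigned lowered_simd_width(unsigned gen, enum sketch_op op,
                                   bool has_shadow_c, unsigned exec_size)
{
   switch (op) {
   case OP_TXD:
      /* TXD is unsupported in SIMD16 mode. */
      return 8;
   case OP_TG4_OFFSET:
      /* gather4_po_c is unsupported in SIMD16 mode. */
      return has_shadow_c ? 8 : exec_size;
   case OP_TXL:
   case OP_TXB:
      /* Gen4 has no SIMD8 non-shadow bias/LOD messages; Gen4-6 have no
       * SIMD16 shadow-compare bias/LOD messages. */
      if (gen == 4 && !has_shadow_c)
         return 16;
      if (gen < 7 && has_shadow_c)
         return 8;
      return exec_size;
   case OP_TXF:
   case OP_TXS:
      /* Gen4 has no SIMD8 RESINFO or LD-with-LOD messages. */
      return gen == 4 ? 16 : exec_size;
   default:
      return exec_size;
   }
}
```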
[Mesa-dev] [PATCH 06/12] i965/fs: Lower SHADER_OPCODE_TXF_UMS/MCS_LOGICAL too on Gen7+.
These weren't being handled by emit_texture_gen7() but we can easily lower them here for consistency with other texturing opcodes.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 06cfc97..7387ca5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3498,12 +3498,18 @@ lower_sampler_logical_send_gen7(const fs_builder &bld, fs_inst *inst, opcode op,
       coordinate_done = true;
       break;
    case SHADER_OPCODE_TXF_CMS:
-      bld.MOV(retype(sources[length], BRW_REGISTER_TYPE_UD), sample_index);
-      length++;
+   case SHADER_OPCODE_TXF_UMS:
+   case SHADER_OPCODE_TXF_MCS:
+      if (op == SHADER_OPCODE_TXF_UMS || op == SHADER_OPCODE_TXF_CMS) {
+         bld.MOV(retype(sources[length], BRW_REGISTER_TYPE_UD), sample_index);
+         length++;
+      }
 
-      /* Data from the multisample control surface. */
-      bld.MOV(retype(sources[length], BRW_REGISTER_TYPE_UD), mcs);
-      length++;
+      if (op == SHADER_OPCODE_TXF_CMS) {
+         /* Data from the multisample control surface. */
+         bld.MOV(retype(sources[length], BRW_REGISTER_TYPE_UD), mcs);
+         length++;
+      }
 
       /* There is no offsetting for this message; just copy in the integer
        * texture coordinates.
--
2.4.3
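The effect of the new conditionals is that the three multisample fetch opcodes differ only in how many extra parameters precede the coordinates. A minimal sketch of that count, using a hypothetical enum and function name rather than the driver's types:

```c
/* Hypothetical stand-ins for the three opcodes handled by the hunk. */
enum sketch_ms_op { MS_TXF_CMS, MS_TXF_UMS, MS_TXF_MCS };

/* Number of extra payload parameters emitted before the coordinates. */
static unsigned ms_extra_params(enum sketch_ms_op op)
{
   unsigned length = 0;

   /* Sample index: sent for UMS and CMS, not for MCS. */
   if (op == MS_TXF_UMS || op == MS_TXF_CMS)
      length++;

   /* Multisample control surface data: sent for CMS only. */
   if (op == MS_TXF_CMS)
      length++;

   return length;
}
```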
[Mesa-dev] [PATCH 01/12] i965/fs: Define logical texture sampling opcodes.
Each logical variant is largely equivalent to the original opcode but instead of taking a single payload source it expects the arguments separately as individual sources, like: tex_logical dst, coordinates, shadow_c, lod, lod2, sample_index, mcs, sampler, offset, num_coordinate_components, num_grad_components This patch defines the opcodes and usual instruction boilerplate, including a placeholder lowering function provided mostly as documentation for their source registers. --- src/mesa/drivers/dri/i965/brw_defines.h | 12 + src/mesa/drivers/dri/i965/brw_fs.cpp | 92 src/mesa/drivers/dri/i965/brw_shader.cpp | 25 + 3 files changed, 129 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 9099676..193fcbe 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -890,17 +890,29 @@ enum opcode { SHADER_OPCODE_COS, SHADER_OPCODE_TEX, + SHADER_OPCODE_TEX_LOGICAL, SHADER_OPCODE_TXD, + SHADER_OPCODE_TXD_LOGICAL, SHADER_OPCODE_TXF, + SHADER_OPCODE_TXF_LOGICAL, SHADER_OPCODE_TXL, + SHADER_OPCODE_TXL_LOGICAL, SHADER_OPCODE_TXS, + SHADER_OPCODE_TXS_LOGICAL, FS_OPCODE_TXB, + FS_OPCODE_TXB_LOGICAL, SHADER_OPCODE_TXF_CMS, + SHADER_OPCODE_TXF_CMS_LOGICAL, SHADER_OPCODE_TXF_UMS, + SHADER_OPCODE_TXF_UMS_LOGICAL, SHADER_OPCODE_TXF_MCS, + SHADER_OPCODE_TXF_MCS_LOGICAL, SHADER_OPCODE_LOD, + SHADER_OPCODE_LOD_LOGICAL, SHADER_OPCODE_TG4, + SHADER_OPCODE_TG4_LOGICAL, SHADER_OPCODE_TG4_OFFSET, + SHADER_OPCODE_TG4_OFFSET_LOGICAL, /** * Combines multiple sources of size 1 into a larger virtual GRF. 
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 503d4d8..6afb9fe 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -711,6 +711,31 @@ fs_inst::regs_read(int arg) const components = src[6].fixed_hw_reg.dw1.ud; break; + case SHADER_OPCODE_TEX_LOGICAL: + case SHADER_OPCODE_TXD_LOGICAL: + case SHADER_OPCODE_TXF_LOGICAL: + case SHADER_OPCODE_TXL_LOGICAL: + case SHADER_OPCODE_TXS_LOGICAL: + case FS_OPCODE_TXB_LOGICAL: + case SHADER_OPCODE_TXF_CMS_LOGICAL: + case SHADER_OPCODE_TXF_UMS_LOGICAL: + case SHADER_OPCODE_TXF_MCS_LOGICAL: + case SHADER_OPCODE_LOD_LOGICAL: + case SHADER_OPCODE_TG4_LOGICAL: + case SHADER_OPCODE_TG4_OFFSET_LOGICAL: + assert(src[8].file == IMM && src[9].file == IMM); + /* Texture coordinates. */ + if (arg == 0) + components = src[8].fixed_hw_reg.dw1.ud; + /* Texture derivatives/LOD. */ + else if (arg == 2 || arg == 3) + components = (opcode == SHADER_OPCODE_TXD_LOGICAL ? + src[9].fixed_hw_reg.dw1.ud : 1); + /* Texture offset. 
*/ + else if (arg == 7) + components = 2; + break; + default: if (is_tex() && arg == 0 && src[0].file == GRF) return mlen; @@ -3344,6 +3369,25 @@ lower_fb_write_logical_send(const fs_builder &bld, fs_inst *inst, inst->header_size = header_size; } +static void +lower_sampler_logical_send(const fs_builder &bld, fs_inst *inst, opcode op) +{ + const brw_device_info *devinfo = bld.shader->devinfo; + const fs_reg &coordinate = inst->src[0]; + const fs_reg &shadow_c = inst->src[1]; + const fs_reg &lod = inst->src[2]; + const fs_reg &lod2 = inst->src[3]; + const fs_reg &sample_index = inst->src[4]; + const fs_reg &mcs = inst->src[5]; + const fs_reg &sampler = inst->src[6]; + const fs_reg &offset_value = inst->src[7]; + assert(inst->src[8].file == IMM && inst->src[9].file == IMM); + const unsigned coord_components = inst->src[8].fixed_hw_reg.dw1.ud; + const unsigned grad_components = inst->src[9].fixed_hw_reg.dw1.ud; + + assert(!"Not implemented"); +} + bool fs_visitor::lower_logical_sends() { @@ -3363,6 +3407,54 @@ fs_visitor::lower_logical_sends() payload); break; + case SHADER_OPCODE_TEX_LOGICAL: + lower_sampler_logical_send(ibld, inst, SHADER_OPCODE_TEX); + break; + + case SHADER_OPCODE_TXD_LOGICAL: + lower_sampler_logical_send(ibld, inst, SHADER_OPCODE_TXD); + break; + + case SHADER_OPCODE_TXF_LOGICAL: + lower_sampler_logical_send(ibld, inst, SHADER_OPCODE_TXF); + break; + + case SHADER_OPCODE_TXL_LOGICAL: + lower_sampler_logical_send(ibld, inst, SHADER_OPCODE_TXL); + break; + + case SHADER_OPCODE_TXS_LOGICAL: + lower_sampler_logical_send(ibld, inst, SHADER_OPCODE_TXS); + break; + + case FS_OPCODE_TXB_LOGICAL: + lower_sampler_logical_send(ibld, inst, FS_OPCODE_TXB); + break; + + case SHADER_OPCODE_TXF_CMS_LOGICAL: + lower_sampler_logical_send(ibld, inst, SHADER_
[Mesa-dev] [PATCH 05/12] i965/fs: Implement lowering of logical texturing opcodes on Gen7+.
This should be largely equivalent to emit_texture_gen7() except that we now get i965 sampling opcodes directly rather than ir_texture_opcode enum values. The mapping is as follows: - ir_tex -> SHADER_OPCODE_TEX - ir_txb -> FS_OPCODE_TXB - ir_txl -> SHADER_OPCODE_TXL - ir_txd -> SHADER_OPCODE_TXD - ir_txf -> SHADER_OPCODE_TXF - ir_txf_ms -> SHADER_OPCODE_TXF_CMS - ir_txs -> SHADER_OPCODE_TXS - ir_query_levels -> SHADER_OPCODE_TXS too, the visitor will make sure that the provided lod value is zero in this case. - ir_lod -> SHADER_OPCODE_LOD - ir_tg4 -> SHADER_OPCODE_TG4_OFFSET if the offset value is not immediate, SHADER_OPCODE_TG4 otherwise. Other than that there are only minor changes and style fixes like the implementation now being factored out in static functions to improve encapsulation. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 217 ++- 1 file changed, 216 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index c31a0e1..06cfc97 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3369,6 +3369,214 @@ lower_fb_write_logical_send(const fs_builder &bld, fs_inst *inst, inst->header_size = header_size; } +static bool +is_high_sampler(const struct brw_device_info *devinfo, const fs_reg &sampler) +{ + if (devinfo->gen < 8 && !devinfo->is_haswell) + return false; + + return sampler.file != IMM || sampler.fixed_hw_reg.dw1.ud >= 16; +} + +static void +lower_sampler_logical_send_gen7(const fs_builder &bld, fs_inst *inst, opcode op, +fs_reg coordinate, +const fs_reg &shadow_c, +fs_reg lod, fs_reg lod2, +const fs_reg &sample_index, +const fs_reg &mcs, const fs_reg &sampler, +fs_reg offset_value, +unsigned coord_components, +unsigned grad_components) +{ + const brw_device_info *devinfo = bld.shader->devinfo; + int reg_width = bld.dispatch_width() / 8; + unsigned header_size = 0, length = 0; + fs_reg sources[MAX_SAMPLER_MESSAGE_SIZE]; + for (unsigned i = 0; i < 
ARRAY_SIZE(sources); i++) + sources[i] = bld.vgrf(BRW_REGISTER_TYPE_F); + + if (op == SHADER_OPCODE_TG4 || op == SHADER_OPCODE_TG4_OFFSET || + offset_value.file != BAD_FILE || + is_high_sampler(devinfo, sampler)) { + /* For general texture offsets (no txf workaround), we need a header to + * put them in. Note that we're only reserving space for it in the + * message payload as it will be initialized implicitly by the + * generator. + * + * TG4 needs to place its channel select in the header, for interaction + * with ARB_texture_swizzle. The sampler index is only 4-bits, so for + * larger sampler numbers we need to offset the Sampler State Pointer in + * the header. + */ + header_size = 1; + sources[0] = fs_reg(); + length++; + } + + if (shadow_c.file != BAD_FILE) { + bld.MOV(sources[length], shadow_c); + length++; + } + + bool coordinate_done = false; + + /* The sampler can only meaningfully compute LOD for fragment shader +* messages. For all other stages, we change the opcode to TXL and +* hardcode the LOD to 0. +*/ + if (bld.shader->stage != MESA_SHADER_FRAGMENT && + op == SHADER_OPCODE_TEX) { + op = SHADER_OPCODE_TXL; + lod = fs_reg(0.0f); + } + + /* Set up the LOD info */ + switch (op) { + case FS_OPCODE_TXB: + case SHADER_OPCODE_TXL: + bld.MOV(sources[length], lod); + length++; + break; + case SHADER_OPCODE_TXD: + /* TXD should have been lowered in SIMD16 mode. */ + assert(bld.dispatch_width() == 8); + + /* Load dPdx and the coordinate together: + * [hdr], [ref], x, dPdx.x, dPdy.x, y, dPdx.y, dPdy.y, z, dPdx.z, dPdy.z + */ + for (unsigned i = 0; i < coord_components; i++) { + bld.MOV(sources[length], coordinate); + coordinate = offset(coordinate, bld, 1); + length++; + + /* For cube map array, the coordinate is (u,v,r,ai) but there are + * only derivatives for (u, v, r). 
+ */ + if (i < grad_components) { +bld.MOV(sources[length], lod); +lod = offset(lod, bld, 1); +length++; + +bld.MOV(sources[length], lod2); +lod2 = offset(lod2, bld, 1); +length++; + } + } + + coordinate_done = true; + break; + case SHADER_OPCODE_TXS: + bld.MOV(retype(sources[length], BRW_REGISTER_TYPE_UD), lod); + length++; + break; + case SHADER_OPCODE_TXF: + /* Unfortunately, the parameters for LD are intermixed: u, lo
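The opcode mapping listed in the commit message of this patch can be written out as a single function. The sketch below uses hypothetical enum names that mirror the ir_* and opcode names from the message; it is not the visitor code, and has_nonconst_offset is an assumed parameter standing in for "the offset value is not immediate".

```c
#include <stdbool.h>

/* Hypothetical mirrors of the ir_texture_opcode values named above. */
enum sketch_ir_op {
   IR_TEX, IR_TXB, IR_TXL, IR_TXD, IR_TXF, IR_TXF_MS,
   IR_TXS, IR_QUERY_LEVELS, IR_LOD, IR_TG4
};

/* Hypothetical mirrors of the i965 sampling opcodes named above. */
enum sketch_opcode {
   OPC_TEX, OPC_TXB, OPC_TXL, OPC_TXD, OPC_TXF, OPC_TXF_CMS,
   OPC_TXS, OPC_LOD, OPC_TG4, OPC_TG4_OFFSET
};

static enum sketch_opcode map_ir_op(enum sketch_ir_op op,
                                    bool has_nonconst_offset)
{
   switch (op) {
   case IR_TEX:          return OPC_TEX;
   case IR_TXB:          return OPC_TXB;
   case IR_TXL:          return OPC_TXL;
   case IR_TXD:          return OPC_TXD;
   case IR_TXF:          return OPC_TXF;
   case IR_TXF_MS:       return OPC_TXF_CMS;
   case IR_TXS:          return OPC_TXS;
   /* Query levels reuses TXS; the caller is expected to pass LOD zero. */
   case IR_QUERY_LEVELS: return OPC_TXS;
   case IR_LOD:          return OPC_LOD;
   case IR_TG4:
      return has_nonconst_offset ? OPC_TG4_OFFSET : OPC_TG4;
   }
   return OPC_TEX; /* not reached for valid enum values */
}
```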
[Mesa-dev] [PATCH 04/12] i965/fs: Pass a BAD_FILE header source to LOAD_PAYLOAD in emit_texture_gen7().
So that it's left uninitialized by LOAD_PAYLOAD, we only need to reserve space for it in the message since it will be initialized implicitly by the generator. --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 522e13e..89fcc49 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -473,8 +473,9 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, fs_reg dst, if (op == ir_tg4 || offset_value.file != BAD_FILE || is_high_sampler(devinfo, sampler)) { /* For general texture offsets (no txf workaround), we need a header to - * put them in. Note that for SIMD16 we're making space for two actual - * hardware registers here, so the emit will have to fix up for this. + * put them in. Note that we're only reserving space for it in the + * message payload as it will be initialized implicitly by the + * generator. * * * ir4_tg4 needs to place its channel select in the header, * for interaction with ARB_texture_swizzle @@ -483,7 +484,7 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, fs_reg dst, * need to offset the Sampler State Pointer in the header. */ header_size = 1; - sources[0] = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD); + sources[0] = fs_reg(); length++; } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 08/12] i965/fs: Implement lowering of logical texturing opcodes on Gen4.
Unlike its Gen5 and Gen7 counterparts this patch isn't a plain refactor of the previous Gen4 texturing code, it's more of a rewrite largely based on emit_texture_gen4_simd16(). The reason is that on the one hand the original emit_texture_gen4() code didn't seem easily fixable to be SIMD width-invariant and had plenty of clutter to support SIMD-width workarounds which are no longer required. On the other hand emit_texture_gen4_simd16() was missing a number of SIMD8-only opcodes. This should generalize both and roughly match their current behaviour where there is overlap. Incidentally this will fix the following piglits on Gen4: arb_shader_texture_lod.execution.arb_shader_texture_lod-texgrad arb_shader_texture_lod.execution.tex-miplevel-selection *gradarb 2d arb_shader_texture_lod.execution.tex-miplevel-selection *gradarb 3d arb_shader_texture_lod.execution.tex-miplevel-selection *projgradarb 2d arb_shader_texture_lod.execution.tex-miplevel-selection *projgradarb 2d_projvec4 arb_shader_texture_lod.execution.tex-miplevel-selection *projgradarb 3d --- src/mesa/drivers/dri/i965/brw_fs.cpp | 108 ++- 1 file changed, 107 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 5233ac3..043d9e9 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3370,6 +3370,110 @@ lower_fb_write_logical_send(const fs_builder &bld, fs_inst *inst, } static void +lower_sampler_logical_send_gen4(const fs_builder &bld, fs_inst *inst, opcode op, +const fs_reg &coordinate, +const fs_reg &shadow_c, +const fs_reg &lod, const fs_reg &lod2, +const fs_reg &sampler, +unsigned coord_components, +unsigned grad_components) +{ + const bool has_lod = (op == SHADER_OPCODE_TXL || op == FS_OPCODE_TXB || + op == SHADER_OPCODE_TXF || op == SHADER_OPCODE_TXS); + fs_reg msg_begin(MRF, 1, BRW_REGISTER_TYPE_F); + fs_reg msg_end = msg_begin; + + /* g0 header. 
*/ + msg_end = offset(msg_end, bld.group(8, 0), 1); + + for (unsigned i = 0; i < coord_components; i++) + bld.MOV(retype(offset(msg_end, bld, i), coordinate.type), + offset(coordinate, bld, i)); + + msg_end = offset(msg_end, bld, coord_components); + + /* Messages other than SAMPLE and RESINFO in SIMD16 and TXD in SIMD8 +* require all three components to be present and zero if they are unused. +*/ + if (coord_components > 0 && + (has_lod || shadow_c.file != BAD_FILE || +(op == SHADER_OPCODE_TEX && bld.dispatch_width() == 8))) { + for (unsigned i = coord_components; i < 3; i++) + bld.MOV(offset(msg_end, bld, i), fs_reg(0.0f)); + + msg_end = offset(msg_end, bld, 3 - coord_components); + } + + if (op == SHADER_OPCODE_TXD) { + /* TXD unsupported in SIMD16 mode. */ + assert(bld.dispatch_width() == 8); + + /* the slots for u and v are always present, but r is optional */ + if (coord_components < 2) + msg_end = offset(msg_end, bld, 2 - coord_components); + + /* P = u, v, r + * dPdx = dudx, dvdx, drdx + * dPdy = dudy, dvdy, drdy + * + * 1-arg: Does not exist. + * + * 2-arg: dudx dvdx dudy dvdy + *dPdx.x dPdx.y dPdy.x dPdy.y + *m4 m5 m6 m7 + * + * 3-arg: dudx dvdx drdx dudy dvdy drdy + *dPdx.x dPdx.y dPdx.z dPdy.x dPdy.y dPdy.z + *m5 m6 m7 m8 m9 m10 + */ + for (unsigned i = 0; i < grad_components; i++) + bld.MOV(offset(msg_end, bld, i), offset(lod, bld, i)); + + msg_end = offset(msg_end, bld, MAX2(grad_components, 2)); + + for (unsigned i = 0; i < grad_components; i++) + bld.MOV(offset(msg_end, bld, i), offset(lod2, bld, i)); + + msg_end = offset(msg_end, bld, MAX2(grad_components, 2)); + } + + if (has_lod) { + /* Bias/LOD with shadow comparitor is unsupported in SIMD16 -- *Without* + * shadow comparitor (including RESINFO) it's unsupported in SIMD8 mode. + */ + assert(shadow_c.file != BAD_FILE ? bld.dispatch_width() == 8 : + bld.dispatch_width() == 16); + + const brw_reg_type type = + (op == SHADER_OPCODE_TXF || op == SHADER_OPCODE_TXS ? 
+ BRW_REGISTER_TYPE_UD : BRW_REGISTER_TYPE_F); + bld.MOV(retype(msg_end, type), lod); + msg_end = offset(msg_end, bld, 1); + } + + if (shadow_c.file != BAD_FILE) { + if (op == SHADER_OPCODE_TEX && bld.dispatch_width() == 8) { + /* There's no plain shadow compare message, so we use shadow + * compare with a bias of 0.0. + */ + bld.MOV(msg_end, fs_reg(0.0f)); + msg
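The TXD layout comment in the patch above can be checked with a small stand-alone sketch. This is not Mesa code: `TxdLayout` and `txd_layout` are hypothetical names, and the sketch assumes SIMD8 (one register per component), a message starting at m1 as in `msg_begin(MRF, 1, ...)`, and no shadow comparator (which would force the coordinate to three slots).

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not part of the patch: register indices of a SIMD8
// Gen4 TXD payload, following the slot rules stated in the patch comment.
struct TxdLayout {
   unsigned coord; // first MRF holding the coordinate (u)
   unsigned dPdx;  // first MRF holding dPdx
   unsigned dPdy;  // first MRF holding dPdy
   unsigned end;   // one past the last MRF of the message
};

inline TxdLayout
txd_layout(unsigned coord_components, unsigned grad_components)
{
   assert(coord_components <= 3 && grad_components <= 3);

   const unsigned base_mrf = 1;        // assumption: message starts at m1
   TxdLayout l;
   l.coord = base_mrf + 1;             // skip the g0 header
   // The slots for u and v are always present, r is optional.
   l.dPdx = l.coord + std::max(coord_components, 2u);
   // Each gradient occupies at least two slots as well.
   l.dPdy = l.dPdx + std::max(grad_components, 2u);
   l.end = l.dPdy + std::max(grad_components, 2u);
   return l;
}
```

With two components this puts dPdx at m4 and dPdy at m6, and with three components at m5 and m8, which agrees with the "2-arg" and "3-arg" rows of the comment.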
[Mesa-dev] [PATCH 10/12] i965/fs: Reimplement emit_texture() in terms of logical send messages.
--- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 66 +--- 1 file changed, 49 insertions(+), 17 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 89fcc49..4011639 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -861,6 +861,14 @@ fs_visitor::emit_texture(ir_texture_opcode op, } } + if (op == ir_query_levels) { + /* textureQueryLevels() is implemented in terms of TXS so we need to + * pass a valid LOD argument. + */ + assert(lod.file == BAD_FILE); + lod = fs_reg(0u); + } + if (coordinate.file != BAD_FILE) { /* FINISHME: Texture coordinate rescaling doesn't work with non-constant * samplers. This should only be a problem with GL_CLAMP on Gen7. @@ -873,26 +881,50 @@ fs_visitor::emit_texture(ir_texture_opcode op, * samples, so don't worry about them. */ fs_reg dst = vgrf(glsl_type::get_instance(dest_type->base_type, 4, 1)); + const fs_reg srcs[] = { + coordinate, shadow_c, lod, lod2, + sample_index, mcs, sampler_reg, offset_value, + fs_reg(coord_components), fs_reg(grad_components) + }; + enum opcode opcode; - if (devinfo->gen >= 7) { - inst = emit_texture_gen7(op, dst, coordinate, coord_components, - shadow_c, lod, lod2, grad_components, - sample_index, mcs, sampler_reg, - offset_value); - } else if (devinfo->gen >= 5) { - inst = emit_texture_gen5(op, dst, coordinate, coord_components, - shadow_c, lod, lod2, grad_components, - sample_index, sampler, - offset_value.file != BAD_FILE); - } else if (dispatch_width == 16) { - inst = emit_texture_gen4_simd16(op, dst, coordinate, coord_components, - shadow_c, lod, sampler); - } else { - inst = emit_texture_gen4(op, dst, coordinate, coord_components, - shadow_c, lod, lod2, grad_components, - sampler); + switch (op) { + case ir_tex: + opcode = SHADER_OPCODE_TEX_LOGICAL; + break; + case ir_txb: + opcode = FS_OPCODE_TXB_LOGICAL; + break; + case ir_txl: + opcode = SHADER_OPCODE_TXL_LOGICAL; + 
break; + case ir_txd: + opcode = SHADER_OPCODE_TXD_LOGICAL; + break; + case ir_txf: + opcode = SHADER_OPCODE_TXF_LOGICAL; + break; + case ir_txf_ms: + opcode = SHADER_OPCODE_TXF_CMS_LOGICAL; + break; + case ir_txs: + case ir_query_levels: + opcode = SHADER_OPCODE_TXS_LOGICAL; + break; + case ir_lod: + opcode = SHADER_OPCODE_LOD_LOGICAL; + break; + case ir_tg4: + opcode = (offset_value.file != BAD_FILE && offset_value.file != IMM ? +SHADER_OPCODE_TG4_OFFSET_LOGICAL : SHADER_OPCODE_TG4_LOGICAL); + break; + default: + unreachable("not reached"); } + inst = bld.emit(opcode, dst, srcs, ARRAY_SIZE(srcs)); + inst->regs_written = 4 * dispatch_width / 8; + if (shadow_c.file != BAD_FILE) inst->shadow_compare = true; -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
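For readers without the tree at hand, the opcode selection done by the switch above can be sketched in isolation. The enums below are stand-ins invented for this sketch (the real ones are ir_texture_opcode, the *_LOGICAL opcodes and the register file enum in Mesa), so treat this as an illustration of the mapping only.

```cpp
#include <cassert>

// Stand-in enums for illustration; these are not the Mesa definitions.
enum class TexOp { tex, txb, txl, txd, txf, txf_ms, txs, query_levels, lod, tg4 };
enum class LogicalOp { TEX, TXB, TXL, TXD, TXF, TXF_CMS, TXS, LOD, TG4, TG4_OFFSET };
enum class RegFile { BAD_FILE, IMM, GRF };

inline LogicalOp
logical_opcode(TexOp op, RegFile offset_file)
{
   switch (op) {
   case TexOp::tex:          return LogicalOp::TEX;
   case TexOp::txb:          return LogicalOp::TXB;
   case TexOp::txl:          return LogicalOp::TXL;
   case TexOp::txd:          return LogicalOp::TXD;
   case TexOp::txf:          return LogicalOp::TXF;
   case TexOp::txf_ms:       return LogicalOp::TXF_CMS;
   case TexOp::txs:
   case TexOp::query_levels: return LogicalOp::TXS; // levels query goes through TXS
   case TexOp::lod:          return LogicalOp::LOD;
   case TexOp::tg4:
      // Only a non-immediate offset needs the dedicated gather opcode.
      return (offset_file != RegFile::BAD_FILE && offset_file != RegFile::IMM)
             ? LogicalOp::TG4_OFFSET : LogicalOp::TG4;
   }
   assert(false && "not reached");
   return LogicalOp::TEX;
}
```

The only data-dependent case is tg4: an absent or immediate offset keeps the plain gather opcode.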
[Mesa-dev] [PATCH 12/12] i965/fs: Remove the emit_texture_gen*() fs_visitor methods.
This is now dead code. --- src/mesa/drivers/dri/i965/brw_fs.h | 21 - src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 608 --- 2 files changed, 629 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index fed5d23..dba869c 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -208,27 +208,6 @@ public: void compute_sample_position(fs_reg dst, fs_reg int_sample_pos); fs_reg rescale_texcoord(fs_reg coordinate, int coord_components, bool is_rect, uint32_t sampler, int texunit); - fs_inst *emit_texture_gen4(ir_texture_opcode op, fs_reg dst, - fs_reg coordinate, int coord_components, - fs_reg shadow_comp, - fs_reg lod, fs_reg lod2, int grad_components, - uint32_t sampler); - fs_inst *emit_texture_gen4_simd16(ir_texture_opcode op, fs_reg dst, - fs_reg coordinate, int vector_elements, - fs_reg shadow_c, fs_reg lod, - uint32_t sampler); - fs_inst *emit_texture_gen5(ir_texture_opcode op, fs_reg dst, - fs_reg coordinate, int coord_components, - fs_reg shadow_comp, - fs_reg lod, fs_reg lod2, int grad_components, - fs_reg sample_index, uint32_t sampler, - bool has_offset); - fs_inst *emit_texture_gen7(ir_texture_opcode op, fs_reg dst, - fs_reg coordinate, int coord_components, - fs_reg shadow_comp, - fs_reg lod, fs_reg lod2, int grad_components, - fs_reg sample_index, fs_reg mcs, fs_reg sampler, - fs_reg offset_value); void emit_texture(ir_texture_opcode op, const glsl_type *dest_type, fs_reg coordinate, int components, diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index c8aa494..082832e 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -77,614 +77,6 @@ fs_visitor::emit_vs_system_value(int location) return reg; } -fs_inst * -fs_visitor::emit_texture_gen4(ir_texture_opcode op, fs_reg dst, - fs_reg coordinate, int coord_components, - fs_reg shadow_c, - fs_reg lod, fs_reg dPdy, int 
grad_components, - uint32_t sampler) -{ - int mlen; - int base_mrf = 1; - bool simd16 = false; - fs_reg orig_dst; - - /* g0 header. */ - mlen = 1; - - if (shadow_c.file != BAD_FILE) { - for (int i = 0; i < coord_components; i++) { - bld.MOV(fs_reg(MRF, base_mrf + mlen + i), coordinate); -coordinate = offset(coordinate, bld, 1); - } - - /* gen4's SIMD8 sampler always has the slots for u,v,r present. - * the unused slots must be zeroed. - */ - for (int i = coord_components; i < 3; i++) { - bld.MOV(fs_reg(MRF, base_mrf + mlen + i), fs_reg(0.0f)); - } - mlen += 3; - - if (op == ir_tex) { -/* There's no plain shadow compare message, so we use shadow - * compare with a bias of 0.0. - */ - bld.MOV(fs_reg(MRF, base_mrf + mlen), fs_reg(0.0f)); -mlen++; - } else if (op == ir_txb || op == ir_txl) { - bld.MOV(fs_reg(MRF, base_mrf + mlen), lod); -mlen++; - } else { - unreachable("Should not get here."); - } - - bld.MOV(fs_reg(MRF, base_mrf + mlen), shadow_c); - mlen++; - } else if (op == ir_tex) { - for (int i = 0; i < coord_components; i++) { - bld.MOV(fs_reg(MRF, base_mrf + mlen + i), coordinate); -coordinate = offset(coordinate, bld, 1); - } - /* zero the others. */ - for (int i = coord_components; i<3; i++) { - bld.MOV(fs_reg(MRF, base_mrf + mlen + i), fs_reg(0.0f)); - } - /* gen4's SIMD8 sampler always has the slots for u,v,r present. */ - mlen += 3; - } else if (op == ir_txd) { - fs_reg &dPdx = lod; - - for (int i = 0; i < coord_components; i++) { - bld.MOV(fs_reg(MRF, base_mrf + mlen + i), coordinate); -coordinate = offset(coordinate, bld, 1); - } - /* the slots for u and v are always present, but r is optional */ - mlen += MAX2(coord_components, 2); - - /* P = u, v, r - * dPdx = dudx, dvdx, drdx - * dPdy = dudy, dvdy, drdy - * - * 1-arg: Does not exist. - * - * 2-arg: dudx dvdx dudy dvdy - *dPdx.x dPdx.y dPdy.x dPdy.y - *m4 m5 m6 m7 -
[Mesa-dev] [PATCH 11/12] i965/fs: Reimplement emit_mcs_fetch() in terms of logical sends.
--- src/mesa/drivers/dri/i965/brw_fs.h | 3 ++- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 36 ++-- 2 files changed, 15 insertions(+), 24 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 64f89d4..fed5d23 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -243,7 +243,8 @@ public: uint32_t sampler, fs_reg sampler_reg, int texunit); - fs_reg emit_mcs_fetch(fs_reg coordinate, int components, fs_reg sampler); + fs_reg emit_mcs_fetch(const fs_reg &coordinate, unsigned components, + const fs_reg &sampler); void emit_gen6_gather_wa(uint8_t wa, fs_reg dst); void resolve_source_modifiers(fs_reg *src); void emit_discard_jump(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 4011639..c8aa494 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -797,31 +797,21 @@ fs_visitor::rescale_texcoord(fs_reg coordinate, int coord_components, /* Sample from the MCS surface attached to this multisample texture. 
*/ fs_reg -fs_visitor::emit_mcs_fetch(fs_reg coordinate, int components, fs_reg sampler) +fs_visitor::emit_mcs_fetch(const fs_reg &coordinate, unsigned components, + const fs_reg &sampler) { - int reg_width = dispatch_width / 8; - fs_reg payload = fs_reg(GRF, alloc.allocate(components * reg_width), - BRW_REGISTER_TYPE_F); - fs_reg dest = vgrf(glsl_type::uvec4_type); - fs_reg *sources = ralloc_array(mem_ctx, fs_reg, components); - - /* parameters are: u, v, r; missing parameters are treated as zero */ - for (int i = 0; i < components; i++) { - sources[i] = vgrf(glsl_type::float_type); - bld.MOV(retype(sources[i], BRW_REGISTER_TYPE_D), coordinate); - coordinate = offset(coordinate, bld, 1); - } - - bld.LOAD_PAYLOAD(payload, sources, components, 0); + const fs_reg dest = vgrf(glsl_type::uvec4_type); + const fs_reg srcs[] = { + coordinate, fs_reg(), fs_reg(), fs_reg(), fs_reg(), fs_reg(), + sampler, fs_reg(), fs_reg(components), fs_reg(0) + }; + fs_inst *inst = bld.emit(SHADER_OPCODE_TXF_MCS_LOGICAL, dest, srcs, +ARRAY_SIZE(srcs)); - fs_inst *inst = bld.emit(SHADER_OPCODE_TXF_MCS, dest, payload, sampler); - inst->base_mrf = -1; - inst->mlen = components * reg_width; - inst->header_size = 0; - inst->regs_written = 4 * reg_width; /* we only care about one reg of -* response, but the sampler always -* writes 4/8 -*/ + /* We only care about one reg of response, but the sampler always writes +* 4/8. +*/ + inst->regs_written = 4 * dispatch_width / 8; return dest; } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
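The "4/8" in the comment above is just the regs_written expression evaluated for the two dispatch widths. A tiny sketch, with a hypothetical helper name and the assumption that only SIMD8 and SIMD16 occur:

```cpp
#include <cassert>

// Hypothetical helper mirroring "4 * dispatch_width / 8" from the patch:
// four response components, each spanning dispatch_width / 8 registers.
inline unsigned
sampler_regs_written(unsigned dispatch_width)
{
   assert(dispatch_width == 8 || dispatch_width == 16);
   return 4 * dispatch_width / 8;
}
```

So a SIMD8 fetch is accounted as writing 4 registers and a SIMD16 fetch as writing 8, even though only one register of the MCS response is used.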
Re: [Mesa-dev] [PATCH] nouveau: use bool instead of boolean
On 17/07/2015 23:08, Ilia Mirkin wrote: On Fri, Jul 17, 2015 at 5:02 PM, Emil Velikov wrote: On 16/07/15 22:39, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 2 +- .../drivers/nouveau/codegen/nv50_ir_driver.h | 14 +-- .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 8 +- .../nouveau/codegen/nv50_ir_lowering_gm107.cpp | 2 +- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 4 +- src/gallium/drivers/nouveau/nouveau_buffer.c | 118 ++--- src/gallium/drivers/nouveau/nouveau_buffer.h | 6 +- src/gallium/drivers/nouveau/nouveau_context.h | 4 +- src/gallium/drivers/nouveau/nouveau_fence.c| 36 +++ src/gallium/drivers/nouveau/nouveau_fence.h| 14 +-- src/gallium/drivers/nouveau/nouveau_screen.c | 6 +- src/gallium/drivers/nouveau/nouveau_screen.h | 6 +- src/gallium/drivers/nouveau/nouveau_video.c| 56 +- src/gallium/drivers/nouveau/nouveau_winsys.h | 4 +- src/gallium/drivers/nouveau/nv30/nv30_clear.c | 2 +- src/gallium/drivers/nouveau/nv30/nv30_context.c| 4 +- src/gallium/drivers/nouveau/nv30/nv30_context.h| 10 +- src/gallium/drivers/nouveau/nv30/nv30_draw.c | 22 ++-- src/gallium/drivers/nouveau/nv30/nv30_fragprog.c | 6 +- src/gallium/drivers/nouveau/nv30/nv30_miptree.c| 4 +- src/gallium/drivers/nouveau/nv30/nv30_push.c | 6 +- src/gallium/drivers/nouveau/nv30/nv30_query.c | 4 +- src/gallium/drivers/nouveau/nv30/nv30_resource.c | 4 +- src/gallium/drivers/nouveau/nv30/nv30_resource.h | 2 +- src/gallium/drivers/nouveau/nv30/nv30_screen.c | 10 +- src/gallium/drivers/nouveau/nv30/nv30_state.h | 4 +- .../drivers/nouveau/nv30/nv30_state_validate.c | 8 +- src/gallium/drivers/nouveau/nv30/nv30_transfer.c | 54 +- src/gallium/drivers/nouveau/nv30/nv30_vbo.c| 26 ++--- src/gallium/drivers/nouveau/nv30/nv30_vertprog.c | 12 +-- src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c | 46 src/gallium/drivers/nouveau/nv30/nvfx_shader.h | 4 +- src/gallium/drivers/nouveau/nv30/nvfx_vertprog.c | 50 - 
src/gallium/drivers/nouveau/nv50/nv50_blit.h | 10 +- src/gallium/drivers/nouveau/nv50/nv50_context.c| 14 +-- src/gallium/drivers/nouveau/nv50/nv50_context.h| 16 +-- src/gallium/drivers/nouveau/nv50/nv50_miptree.c| 26 ++--- src/gallium/drivers/nouveau/nv50/nv50_program.c| 18 ++-- src/gallium/drivers/nouveau/nv50/nv50_program.h| 6 +- src/gallium/drivers/nouveau/nv50/nv50_push.c | 8 +- src/gallium/drivers/nouveau/nv50/nv50_query.c | 38 +++ src/gallium/drivers/nouveau/nv50/nv50_resource.h | 6 +- src/gallium/drivers/nouveau/nv50/nv50_screen.c | 18 ++-- src/gallium/drivers/nouveau/nv50/nv50_screen.h | 16 +-- .../drivers/nouveau/nv50/nv50_shader_state.c | 18 ++-- src/gallium/drivers/nouveau/nv50/nv50_state.c | 24 ++--- .../drivers/nouveau/nv50/nv50_state_validate.c | 14 +-- src/gallium/drivers/nouveau/nv50/nv50_stateobj.h | 6 +- src/gallium/drivers/nouveau/nv50/nv50_surface.c| 64 +-- src/gallium/drivers/nouveau/nv50/nv50_tex.c| 22 ++-- src/gallium/drivers/nouveau/nv50/nv50_vbo.c| 32 +++--- src/gallium/drivers/nouveau/nvc0/nvc0_compute.c| 24 ++--- src/gallium/drivers/nouveau/nvc0/nvc0_compute.h| 2 +- src/gallium/drivers/nouveau/nvc0/nvc0_context.c| 12 +-- src/gallium/drivers/nouveau/nvc0/nvc0_context.h| 22 ++-- src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c| 12 +-- src/gallium/drivers/nouveau/nvc0/nvc0_program.c| 18 ++-- src/gallium/drivers/nouveau/nvc0/nvc0_program.h| 6 +- src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 84 +++ src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 24 ++--- src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 16 +-- .../drivers/nouveau/nvc0/nvc0_shader_state.c | 12 +-- src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 20 ++-- .../drivers/nouveau/nvc0/nvc0_state_validate.c | 20 ++-- src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h | 8 +- src/gallium/drivers/nouveau/nvc0/nvc0_surface.c| 54 +- src/gallium/drivers/nouveau/nvc0/nvc0_tex.c| 38 +++ src/gallium/drivers/nouveau/nvc0/nvc0_transfer.c | 8 +- src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c| 32 +++--- 
.../drivers/nouveau/nvc0/nvc0_vbo_translate.c | 18 ++-- src/gallium/drivers/nouveau/nvc0/nve4_compute.c| 24 ++--- .../winsys/nouveau/drm/nouveau_drm_winsys.c| 2 +- 72 files changed, 685 insertions(+), 685 deletions(-) FWIW I like the idea, but there is a small concern. There are some differences between boolean (char) and bool when it comes to implicit conversions an
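One well-known instance of the conversion difference mentioned above, as a hedged sketch: it assumes boolean is an 8-bit unsigned char typedef, and the helper names are made up for the example. Assigning a wider integer to a char-sized type keeps only the low bits, while conversion to bool tests for non-zero.

```cpp
#include <cassert>
#include <climits>

static_assert(CHAR_BIT == 8, "the example assumes 8-bit chars");

// Assumption for this sketch: boolean is a char-sized integer typedef.
typedef unsigned char boolean;

inline bool
has_flag_boolean(unsigned flags, unsigned mask)
{
   boolean r = flags & mask; // implicit narrowing: only the low 8 bits survive
   return r != 0;
}

inline bool
has_flag_bool(unsigned flags, unsigned mask)
{
   bool r = flags & mask;    // implicit conversion: true for any non-zero value
   return r;
}

inline void
check_boolean_vs_bool()
{
   assert(has_flag_boolean(0x001, 0x001) == has_flag_bool(0x001, 0x001));
   assert(has_flag_boolean(0x100, 0x100) != has_flag_bool(0x100, 0x100));
}
```

Code that stores the result of a mask test above bit 7 therefore changes behaviour when the type is switched, which is the kind of spot worth auditing in a tree-wide replacement.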
[Mesa-dev] [Bug 91385] gallium xvmc tries to symlink non existing libraries on OpenBSD
https://bugs.freedesktop.org/show_bug.cgi?id=91385 Bug ID: 91385 Summary: gallium xvmc tries to symlink non existing libraries on OpenBSD Product: Mesa Version: git Hardware: Other OS: OpenBSD Status: NEW Severity: normal Priority: medium Component: Other Assignee: mesa-dev@lists.freedesktop.org Reporter: j...@openbsd.org QA Contact: mesa-dev@lists.freedesktop.org The gallium xvmc Makefile assumes the system uses Linux style library names with libfoo.so.major.minor.revision with symlinks to libfoo.so.major.minor and libfoo.so.major. On OpenBSD libtool creates libfoo.so.major.minor even if revision is specified and ld.so will find the appropriate library if libfoo.so or libfoo.so.major is dlopen'd. gmake[5]: Entering directory '/usr/users/jsg/src/mesa/src/gallium/targets/xvmc' dest_dir=//usr/X11R6/lib; \ for i in r600; do \ j=libXvMCgallium.so;\ k=libXvMC${i}.so; \ l=${k}.1.0.0; \ ln -f ${dest_dir}/${j}.1.0.0\ ${dest_dir}/${l}; \ ln -sf ${l} \ ${dest_dir}/${k}.1.0;\ ln -sf ${l} \ ${dest_dir}/${k}.1; \ ln -sf ${l} \ ${dest_dir}/${k};\ done; \ rm -f ${dest_dir}/libXvMCgallium.* ln: //usr/X11R6/lib/libXvMCgallium.so.1.0.0: No such file or directory $ find src/gallium/ -name "*XvMC*" src/gallium/targets/xvmc-softpipe/.libs/libXvMCsoftpipe.lai src/gallium/targets/xvmc-softpipe/.libs/libXvMCsoftpipe.so.1.0 src/gallium/targets/xvmc-softpipe/.libs/libXvMCsoftpipe.la src/gallium/targets/xvmc-softpipe/libXvMCsoftpipe.la src/gallium/targets/xvmc-r300/.libs/libXvMCr300.so.1.0 src/gallium/targets/xvmc-r300/.libs/libXvMCr300.lai src/gallium/targets/xvmc-r300/.libs/libXvMCr300.la src/gallium/targets/xvmc-r300/libXvMCr300.la src/gallium/targets/xvmc-r600/.libs/libXvMCr600.lai src/gallium/targets/xvmc-r600/.libs/libXvMCr600.so.1.0 src/gallium/targets/xvmc-r600/.libs/libXvMCr600.la src/gallium/targets/xvmc-r600/libXvMCr600.la src/gallium/targets/xvmc/.deps/libXvMCgallium_la-dummy.Plo src/gallium/targets/xvmc/.deps/libXvMCgallium_la-target.Plo 
src/gallium/targets/xvmc/.libs/libXvMCgallium_la-target.o src/gallium/targets/xvmc/.libs/libXvMCgallium.so.1.0 src/gallium/targets/xvmc/.libs/libXvMCgallium.lai src/gallium/targets/xvmc/.libs/libXvMCgallium.la src/gallium/targets/xvmc/libXvMCgallium.la src/gallium/targets/xvmc/libXvMCgallium_la-target.lo It seems the Makefile has already wrongly installed libXvMCr600.so and libXvMCr600.so.1 into the prefix and correctly installed libXvMCr600.so.1.0 when the error occurs. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91387] Mesa 10.6.1 implementation error: invalid target in _swrast_choose_texture_sample_func
https://bugs.freedesktop.org/show_bug.cgi?id=91387

            Bug ID: 91387
           Summary: Mesa 10.6.1 implementation error: invalid target in _swrast_choose_texture_sample_func
           Product: Mesa
           Version: 10.6
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: medium
         Component: Mesa core
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: michaeldgodf...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

>> plot(1:100)
>> Mesa 10.6.1 implementation error: invalid target in
>> _swrast_choose_texture_sample_func
Please report at https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa
Mesa 10.6.1 implementation error: invalid target in _swrast_choose_texture_sample_func
Please report at https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa
panic: Segmentation fault -- stopping myself...

This occurs using the latest development version of Octave. I strongly suspect that it originates from an error in the Octave code which handles mouse events. However, such a user error should not result in a segfault in the user program.
[Mesa-dev] [PATCH 3/3] clover: add clLinkProgramm (CL 1.2)
--- src/gallium/state_trackers/clover/api/program.cpp | 35 ++ src/gallium/state_trackers/clover/core/error.hpp | 7 + src/gallium/state_trackers/clover/core/program.cpp | 4 +++ src/gallium/state_trackers/clover/core/program.hpp | 1 + .../state_trackers/clover/llvm/invocation.cpp | 2 +- 5 files changed, 48 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/clover/api/program.cpp b/src/gallium/state_trackers/clover/api/program.cpp index 553bc83..7573933 100644 --- a/src/gallium/state_trackers/clover/api/program.cpp +++ b/src/gallium/state_trackers/clover/api/program.cpp @@ -238,6 +238,41 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs, return e.get(); } +CLOVER_API cl_program +clLinkProgram (cl_context d_ctx, cl_uint num_devs, const cl_device_id *d_devs, + const char *p_opts, cl_uint num_progs, const cl_program *d_progs, + void (*pfn_notify) (cl_program, void *), void *user_data, + cl_int *r_errcode) try { + auto &ctx = obj(d_ctx); + auto devs = (d_devs ? objs(d_devs, num_devs) : +ref_vector(ctx.devices())); + auto opts = (p_opts ? 
p_opts : ""); + auto progs = objs(d_progs, num_progs); + + if ((!pfn_notify && user_data)) + throw error(CL_INVALID_VALUE); + + if (any_of([&](const device &dev) { +return !count(dev, ctx.devices()); + }, objs(d_devs, num_devs))) + throw error(CL_INVALID_DEVICE); + + auto prog = create(ctx); + try { + prog().link(devs, opts, progs); + *r_errcode = CL_SUCCESS; + } catch (link_options_error &e) { + throw; + } catch (error &e) { + *r_errcode = CL_LINK_PROGRAM_FAILURE; + } + + return ret_object(prog); +} catch (error &e) { + ret_error(r_errcode, e); + return NULL; +} + CLOVER_API cl_int clUnloadCompiler() { return CL_SUCCESS; diff --git a/src/gallium/state_trackers/clover/core/error.hpp b/src/gallium/state_trackers/clover/core/error.hpp index 4ec619c..f6c55a3 100644 --- a/src/gallium/state_trackers/clover/core/error.hpp +++ b/src/gallium/state_trackers/clover/core/error.hpp @@ -79,6 +79,13 @@ namespace clover { } }; + class link_options_error : public error { + public: + link_options_error(const std::string &what = "") : + error(CL_INVALID_LINKER_OPTIONS , what) { + } + }; + template class invalid_object_error; diff --git a/src/gallium/state_trackers/clover/core/program.cpp b/src/gallium/state_trackers/clover/core/program.cpp index 4aa2622..61fb603 100644 --- a/src/gallium/state_trackers/clover/core/program.cpp +++ b/src/gallium/state_trackers/clover/core/program.cpp @@ -24,6 +24,10 @@ using namespace clover; +program::program(clover::context &ctx) : + has_source(false), context(ctx), _kernel_ref_counter(0) { +} + program::program(clover::context &ctx, const std::string &source) : has_source(true), context(ctx), _source(source), _kernel_ref_counter(0) { } diff --git a/src/gallium/state_trackers/clover/core/program.hpp b/src/gallium/state_trackers/clover/core/program.hpp index 7d86018..c24ad83 100644 --- a/src/gallium/state_trackers/clover/core/program.hpp +++ b/src/gallium/state_trackers/clover/core/program.hpp @@ -37,6 +37,7 @@ namespace clover { evals, const 
std::vector> &> device_range; public: + program(clover::context &ctx); program(clover::context &ctx, const std::string &source); program(clover::context &ctx, diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp index d115f15..2bf7775 100644 --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp @@ -782,7 +782,7 @@ clover::link_program_llvm(const std::vector &modules, clang::CompilerInstance c; if (!create_from_arg_llvm(c, target, options, s_log)) { r_log = log; - throw error(CL_INVALID_LINKER_OPTIONS); + throw link_options_error(); } llvm::Module linked_mod("link", llvm_ctx); -- 2.5.0.rc2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
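The nested try/catch in clLinkProgram above can be read as: invalid linker options propagate to the caller as a hard error, while any other failure during linking is reported as a link failure. A self-contained sketch of that control flow follows; the status codes and classes are simplified stand-ins rather than the clover or OpenCL definitions, and unlike the patch it returns only a status, not a program object.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in status codes for illustration; not the real CL_* values.
enum status { SUCCESS, INVALID_LINKER_OPTIONS, LINK_PROGRAM_FAILURE, INVALID_VALUE };

class error : public std::runtime_error {
public:
   error(status c, const std::string &what = "") :
      std::runtime_error(what), code(c) {
   }

   status get() const {
      return code;
   }

private:
   status code;
};

class link_options_error : public error {
public:
   link_options_error(const std::string &what = "") :
      error(INVALID_LINKER_OPTIONS, what) {
   }
};

// link_step is any callable that throws an error on failure.
template<typename F>
status
link_and_report(F &&link_step) try {
   try {
      link_step();
      return SUCCESS;
   } catch (link_options_error &) {
      throw;                        // bad options: propagate unchanged
   } catch (error &) {
      return LINK_PROGRAM_FAILURE;  // anything else counts as a failed link
   }
} catch (error &e) {
   return e.get();                  // outer handler, like ret_error() in the patch
}
```

The order of the inner handlers matters: link_options_error derives from error, so it has to be caught first or it would be folded into the generic link failure.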
[Mesa-dev] [PATCH 1/3] clover: separate compile and link stages
--- src/gallium/state_trackers/clover/api/program.cpp | 19 +- .../state_trackers/clover/core/compiler.hpp| 7 +- src/gallium/state_trackers/clover/core/error.hpp | 7 + src/gallium/state_trackers/clover/core/program.cpp | 41 +++- src/gallium/state_trackers/clover/core/program.hpp | 4 +- .../state_trackers/clover/llvm/invocation.cpp | 270 +++-- 6 files changed, 257 insertions(+), 91 deletions(-) diff --git a/src/gallium/state_trackers/clover/api/program.cpp b/src/gallium/state_trackers/clover/api/program.cpp index e9b1f38..553bc83 100644 --- a/src/gallium/state_trackers/clover/api/program.cpp +++ b/src/gallium/state_trackers/clover/api/program.cpp @@ -181,13 +181,20 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs, validate_build_program_common(prog, num_devs, d_devs, pfn_notify, user_data); - prog.build(devs, opts); + if (prog.has_source) { + prog.compile(devs, opts); + prog.link(devs, opts, { prog }); + } return CL_SUCCESS; } catch (error &e) { - if (e.get() == CL_INVALID_COMPILER_OPTIONS) - return CL_INVALID_BUILD_OPTIONS; - if (e.get() == CL_COMPILE_PROGRAM_FAILURE) - return CL_BUILD_PROGRAM_FAILURE; + switch (e.get()) { + case CL_INVALID_COMPILER_OPTIONS: + case CL_INVALID_LINKER_OPTIONS: + return CL_INVALID_BUILD_OPTIONS; + case CL_COMPILE_PROGRAM_FAILURE: + case CL_LINK_PROGRAM_FAILURE: + return CL_BUILD_PROGRAM_FAILURE; + } return e.get(); } @@ -224,7 +231,7 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs, range(header_names, num_headers), objs(d_header_progs, num_headers)); - prog.build(devs, opts, headers); + prog.compile(devs, opts, headers); return CL_SUCCESS; } catch (error &e) { diff --git a/src/gallium/state_trackers/clover/core/compiler.hpp b/src/gallium/state_trackers/clover/core/compiler.hpp index 2076417..0d6766a 100644 --- a/src/gallium/state_trackers/clover/core/compiler.hpp +++ b/src/gallium/state_trackers/clover/core/compiler.hpp @@ -32,11 +32,16 @@ namespace clover { module compile_program_llvm(const std::string &source, const 
header_map &headers, - pipe_shader_ir ir, const std::string &target, const std::string &opts, std::string &r_log); + module link_program_llvm(const std::vector &modules, +enum pipe_shader_ir ir, +const std::string &target, +const std::string &opts, +std::string &r_log); + module compile_program_tgsi(const std::string &source, std::string &r_log); } diff --git a/src/gallium/state_trackers/clover/core/error.hpp b/src/gallium/state_trackers/clover/core/error.hpp index 59a5af4..4ec619c 100644 --- a/src/gallium/state_trackers/clover/core/error.hpp +++ b/src/gallium/state_trackers/clover/core/error.hpp @@ -72,6 +72,13 @@ namespace clover { } }; + class link_error : public error { + public: + link_error(const std::string &what = "") : + error(CL_LINK_PROGRAM_FAILURE , what) { + } + }; + template class invalid_object_error; diff --git a/src/gallium/state_trackers/clover/core/program.cpp b/src/gallium/state_trackers/clover/core/program.cpp index 6eebd9c..4aa2622 100644 --- a/src/gallium/state_trackers/clover/core/program.cpp +++ b/src/gallium/state_trackers/clover/core/program.cpp @@ -40,8 +40,8 @@ program::program(clover::context &ctx, } void -program::build(const ref_vector &devs, const char *opts, - const header_map &headers) { +program::compile(const ref_vector &devs, const std::string &opts, + const header_map &headers) { if (has_source) { _devices = devs; @@ -58,9 +58,7 @@ program::build(const ref_vector &devs, const char *opts, auto module = (dev.ir_format() == PIPE_SHADER_IR_TGSI ? 
compile_program_tgsi(_source, log) : compile_program_llvm(_source, headers, -dev.ir_format(), -dev.ir_target(), build_opts(dev), -log)); +dev.ir_target(), opts, log)); _binaries.insert({ &dev, module }); _logs.insert({ &dev, log }); } catch (const error &) { @@ -71,6 +69,39 @@ program::build(const ref_vector &devs, const char *opts, } } +void +program::link(const ref_vector &devs, const std::string &opts, + const ref_vector &progs) { + _devices = devs; + + for (auto &dev : devs) { + if (dev.ir_format() == PIPE_SHADER_IR_TGSI) + continue; + + const std::vector mods = m
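The error translation added to clBuildProgram in this patch folds the compile-stage and link-stage codes into the two build-level codes. In isolation, with stand-in enumerators instead of the real CL_* constants and a hypothetical function name, the mapping is:

```cpp
#include <cassert>

// Stand-in codes for illustration; not the real CL_* values.
enum build_code {
   INVALID_COMPILER_OPTIONS, INVALID_LINKER_OPTIONS, INVALID_BUILD_OPTIONS,
   COMPILE_PROGRAM_FAILURE, LINK_PROGRAM_FAILURE, BUILD_PROGRAM_FAILURE,
   OTHER_ERROR
};

// Translate a compile- or link-stage error into what a build call reports.
inline build_code
build_error(build_code e)
{
   switch (e) {
   case INVALID_COMPILER_OPTIONS:
   case INVALID_LINKER_OPTIONS:
      return INVALID_BUILD_OPTIONS;
   case COMPILE_PROGRAM_FAILURE:
   case LINK_PROGRAM_FAILURE:
      return BUILD_PROGRAM_FAILURE;
   default:
      return e; // everything else is passed through unchanged
   }
}
```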
[Mesa-dev] [PATCH 0/3 v4] add clLinkProgram
This series adds OpenCL 1.2 clLinkProgram. However, it lacks the binary type part that is mandatory for input validation and also for the CL_PROGRAM_BINARY_TYPE query. This will be addressed in another series once we agree on the way to store it.

EdB (3):
  clover: separate compile and link stages
  clover: override ret_object
  clover: add clLinkProgramm (CL 1.2)

 src/gallium/state_trackers/clover/api/program.cpp | 54 -
 src/gallium/state_trackers/clover/api/util.hpp | 12 +
 .../state_trackers/clover/core/compiler.hpp| 7 +-
 src/gallium/state_trackers/clover/core/error.hpp | 14 ++
 src/gallium/state_trackers/clover/core/program.cpp | 45 +++-
 src/gallium/state_trackers/clover/core/program.hpp | 5 +-
 .../state_trackers/clover/llvm/invocation.cpp | 270 +++--
 7 files changed, 316 insertions(+), 91 deletions(-)

-- 
2.5.0.rc2
[Mesa-dev] [PATCH 2/3] clover: override ret_object
Return an API object from an intrusive reference to a Clover object, incrementing the reference count of the object.

Reviewed-by: Francisco Jerez
---
 src/gallium/state_trackers/clover/api/util.hpp | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/src/gallium/state_trackers/clover/api/util.hpp b/src/gallium/state_trackers/clover/api/util.hpp
index 918df61..cb80a17 100644
--- a/src/gallium/state_trackers/clover/api/util.hpp
+++ b/src/gallium/state_trackers/clover/api/util.hpp
@@ -61,6 +61,18 @@ namespace clover {
          *p = desc(v());
       }
    }
+
+   ///
+   /// Return an API object from an intrusive reference to a Clover object,
+   /// incrementing the reference count of the object.
+   ///
+   template<typename T>
+   typename T::descriptor_type *
+   ret_object(const intrusive_ref<T> &v) {
+      v().retain();
+      return desc(v());
+   }
+
 }
 
 #endif
-- 
2.5.0.rc2
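To illustrate why the extra retain() matters: the handle returned to the API user has to keep the object alive after the local intrusive reference goes out of scope. The sketch below uses minimal stand-in classes written for this example (the real clover object and intrusive_ref are richer and delete the object when the count drops to zero), and returns a plain pointer instead of a descriptor.

```cpp
#include <cassert>

// Minimal stand-in for a reference-counted object; not the clover class.
class object {
public:
   object() : refs(0) {}

   void retain() { ++refs; }

   bool release() {
      assert(refs > 0);
      return --refs == 0;
   }

   unsigned ref_count() const { return refs; }

private:
   unsigned refs;
};

// Minimal stand-in for an intrusive reference: holds one count while alive.
// Simplification: it never deletes the object.
template<typename T>
class intrusive_ref {
public:
   intrusive_ref(T &o) : p(&o) { p->retain(); }
   intrusive_ref(const intrusive_ref &r) : p(r.p) { p->retain(); }
   ~intrusive_ref() { p->release(); }
   intrusive_ref &operator=(const intrusive_ref &) = delete;

   T &operator()() const { return *p; }

private:
   T *p;
};

// Same shape as the patch: take one more reference on behalf of the caller
// and hand back the raw handle.
template<typename T>
T *
ret_object(const intrusive_ref<T> &v) {
   v().retain();
   return &v();
}
```

After ret_object() the count is two; once the local reference is destroyed it is still one, owned by whoever received the handle.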
Re: [Mesa-dev] [PATCH] radeonsi: don't return NULL fence if no fence is available
It looks good. Would you push it please?

Reviewed-by: Marek Olšák

Marek

On Fri, Jul 17, 2015 at 11:05 AM, Michel Dänzer wrote:
> On 17.07.2015 06:03, Marek Olšák wrote:
>> From: Marek Olšák
>>
>> An alternative (and ugly) solution to the current clover issue.
>
> How about something like this instead? (Compile tested only)
>
>
> diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c b/src/gallium/drivers/radeonsi/si_hw_context.c
> index 08cc08e..dc8702e 100644
> --- a/src/gallium/drivers/radeonsi/si_hw_context.c
> +++ b/src/gallium/drivers/radeonsi/si_hw_context.c
> @@ -84,7 +84,8 @@ void si_context_gfx_flush(void *context, unsigned flags,
>         struct radeon_winsys_cs *cs = ctx->b.rings.gfx.cs;
>         struct radeon_winsys *ws = ctx->b.ws;
>
> -       if (cs->cdw == ctx->b.initial_gfx_cs_size) {
> +       if (cs->cdw == ctx->b.initial_gfx_cs_size &&
> +           (!fence || ctx->last_gfx_fence)) {
>                 if (fence)
>                         ws->fence_reference(fence, ctx->last_gfx_fence);
>                 if (!(flags & RADEON_FLUSH_ASYNC))
>
>
> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Mesa and X developer
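The condition in the quoted diff reduces to a small predicate: the empty-command-stream branch is only taken when the caller either did not ask for a fence or there is a previous fence to hand back. A sketch with a hypothetical function name and plain booleans standing in for the pointer checks:

```cpp
#include <cassert>

// Hypothetical predicate mirroring the quoted condition:
//   cs->cdw == initial_gfx_cs_size && (!fence || ctx->last_gfx_fence)
inline bool
takes_empty_cs_branch(unsigned cdw, unsigned initial_cs_size,
                      bool fence_requested, bool have_last_fence)
{
   assert(cdw >= initial_cs_size); // assumption: the CS never shrinks below it
   return cdw == initial_cs_size &&
          (!fence_requested || have_last_fence);
}
```

The case that changes relative to the old code is an empty command stream with a fence requested and no previous fence: the old condition took the branch and could hand back a NULL fence, while the new one does not take it.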
[Mesa-dev] [PATCH] mesa: automake: replace $(RM) with rm -f
$(RM) is set to 'rm -f' by GNU make, this is not true of other versions of make and RM is not one of the macros required by POSIX. Signed-off-by: Jonathan Gray --- Makefile.am | 2 +- src/gallium/targets/dri/Makefile.am | 6 +++--- src/gallium/targets/vdpau/Makefile.am | 6 +++--- src/gallium/targets/xvmc/Makefile.am | 4 ++-- src/glsl/Makefile.am | 6 +++--- src/mesa/drivers/dri/Makefile.am | 8 6 files changed, 16 insertions(+), 16 deletions(-) diff --git a/Makefile.am b/Makefile.am index 9f49ce6..6243b4d 100644 --- a/Makefile.am +++ b/Makefile.am @@ -58,4 +58,4 @@ noinst_HEADERS = \ # We list some directories in EXTRA_DIST, but don't actually want to include # the .gitignore files in the tarball. dist-hook: - find $(distdir) -name .gitignore -exec $(RM) {} + + find $(distdir) -name .gitignore -exec rm -f {} + diff --git a/src/gallium/targets/dri/Makefile.am b/src/gallium/targets/dri/Makefile.am index 7c86ea1..e047f33 100644 --- a/src/gallium/targets/dri/Makefile.am +++ b/src/gallium/targets/dri/Makefile.am @@ -117,7 +117,7 @@ all-local: $(dri_LTLIBRARIES) clean-local: $(AM_V_GEN)link_dir=$(top_builddir)/$(LIB_DIR)/gallium; \ $(AM_V_GEN)for i in $(TARGET_DRIVERS); do \ - $(RM) $${link_dir}/$${i}_dri.so;\ + rm -f $${link_dir}/$${i}_dri.so;\ done; endif @@ -128,9 +128,9 @@ install-data-hook: ln -f $(DESTDIR)$(dridir)/gallium_dri.so\ $(DESTDIR)$(dridir)/$${i}_dri.so; \ done; \ - $(RM) $(DESTDIR)$(dridir)/gallium_dri.* + rm -f $(DESTDIR)$(dridir)/gallium_dri.* uninstall-hook: for i in $(TARGET_DRIVERS); do \ - $(RM) $(DESTDIR)$(dridir)/$${i}_dri.so; \ + rm -f $(DESTDIR)$(dridir)/$${i}_dri.so; \ done; diff --git a/src/gallium/targets/vdpau/Makefile.am b/src/gallium/targets/vdpau/Makefile.am index 7eb62c1..67d8cac 100644 --- a/src/gallium/targets/vdpau/Makefile.am +++ b/src/gallium/targets/vdpau/Makefile.am @@ -97,7 +97,7 @@ all-local: $(vdpau_LTLIBRARIES) clean-local: $(AM_V_GEN)link_dir=$(top_builddir)/$(LIB_DIR)/gallium; \ $(AM_V_GEN)for i in $(TARGET_DRIVERS); do \ - 
$(RM) $${link_dir}/libvdpau_$${i}.so{,.$(VDPAU_MAJOR){,.$(VDPAU_MINOR){,.0}}}; \ + rm -f $${link_dir}/libvdpau_$${i}.so{,.$(VDPAU_MAJOR){,.$(VDPAU_MINOR){,.0}}}; \ done; endif @@ -118,9 +118,9 @@ install-data-hook: ln -sf $${l}\ $${dest_dir}/$${k}; \ done; \ - $(RM) $${dest_dir}/libvdpau_gallium.* + rm -f $${dest_dir}/libvdpau_gallium.* uninstall-hook: for i in $(TARGET_DRIVERS); do \ - $(RM) $(DESTDIR)$(vdpaudir)/libvdpau_$${i}.so{,.$(VDPAU_MAJOR){,.$(VDPAU_MINOR){,.0}}}; \ + rm -f $(DESTDIR)$(vdpaudir)/libvdpau_$${i}.so{,.$(VDPAU_MAJOR){,.$(VDPAU_MINOR){,.0}}}; \ done; diff --git a/src/gallium/targets/xvmc/Makefile.am b/src/gallium/targets/xvmc/Makefile.am index b328589..28f98ce 100644 --- a/src/gallium/targets/xvmc/Makefile.am +++ b/src/gallium/targets/xvmc/Makefile.am @@ -80,9 +80,9 @@ install-data-hook: ln -sf $${l}\ $${dest_dir}/$${k}; \ done; \ - $(RM) $${dest_dir}/libXvMCgallium.* + rm -f $${dest_dir}/libXvMCgallium.* uninstall-hook: for i in $(TARGET_DRIVERS); do \ - $(RM) $(DESTDIR)$(xvmcdir)/libXvMC$${i}.so{,.$(XVMC_MAJOR){,.$(XVMC_MINOR){,.0}}}; \ + rm -f $(DESTDIR)$(xvmcdir)/libXvMC$${i}.so{,.$(XVMC_MAJOR){,.$(XVMC_MINOR){,.0}}}; \ done; diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am index e87b8bb..bb7d169 100644 --- a/src/glsl/Makefile.am +++ b/src/glsl/Makefile.am @@ -223,11 +223,11 @@ CLEANFILES = \ $(BUILT_SOURCES) clean-local: - $(RM) -r subtest-cr subtest-cr-lf subtest-lf subtest-lf-cr + rm -f -r subtest-cr subtest-cr-lf subtest-lf subtest-lf-cr dist-hook: - $(RM) glcpp/tests/*.out - $(RM) glcpp/tests/subtest*/*.out + rm -f glcpp/tests/*.out + rm -f glcpp/tests/subtest*/*.out PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS) diff --git a/src/mesa/drivers/dri/Makefile.am b/src/mesa/drivers/dri/Makefile.am index 7656261..044967c