Re: [Mesa-dev] [PATCH] glsl: make consistent use of DECLARE_RALLOC_CXX_OPERATORS
On Tuesday, September 30, 2014 12:14:06 AM Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin
> ---
>
> Noticed this when investigating how ralloc worked. I'm moderately sure that
> the old code was fine, but seems nicer to use the cooked known-to-work macro.

Yeah, Curro fixed this to actually hook up the destructor, so we can use it
now (it didn't used to).

Reviewed-by: Kenneth Graunke
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: relax draw api validation on ES2
Patch fixes failing test in WebGL conformance test
'point-no-attributes' when running Chrome on OpenGL ES.
(Shader program may draw points using constant data in shader.)

No Piglit regressions.

Signed-off-by: Tapani Pälli
---
 src/mesa/main/api_validate.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index 51a3d1f..9b80600 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -112,9 +112,8 @@ check_valid_to_render(struct gl_context *ctx, const char *function)

    switch (ctx->API) {
    case API_OPENGLES2:
-      /* For ES2, we can draw if any vertex array is enabled (and we
-       * should always have a vertex program/shader). */
-      if (ctx->Array.VAO->_Enabled == 0x0 || !ctx->VertexProgram._Current)
+      /* For ES2, we can draw if we have a vertex program/shader). */
+      if (!ctx->VertexProgram._Current)
          return GL_FALSE;
       break;

--
1.9.3
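As a rough illustration of what the patch changes, the two ES2 rules can be compared side by side. This is only a sketch: `struct sketch_context` and both helper functions are hypothetical stand-ins introduced here, not the real `gl_context` or `check_valid_to_render()` code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for the context fields touched by
 * the patch; this is NOT the real struct gl_context. */
struct sketch_context {
   uint64_t enabled_arrays;     /* stands in for ctx->Array.VAO->_Enabled */
   const void *vertex_program;  /* stands in for ctx->VertexProgram._Current */
};

/* Old ES2 rule: at least one enabled vertex array AND a current program. */
static bool
es2_can_draw_old(const struct sketch_context *ctx)
{
   assert(ctx != NULL);
   if (ctx->enabled_arrays == 0x0 || ctx->vertex_program == NULL)
      return false;
   return true;
}

/* New ES2 rule: a current vertex program is sufficient, so a shader that
 * draws points from constant data (no attributes) is accepted. */
static bool
es2_can_draw_new(const struct sketch_context *ctx)
{
   assert(ctx != NULL);
   return ctx->vertex_program != NULL;
}
```

With no arrays enabled but a program bound, the old rule rejects the draw and the new rule accepts it; with no program bound, both reject it.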
Re: [Mesa-dev] [PATCH] mesa: relax draw api validation on ES2
On Tuesday, September 30, 2014 10:28:26 AM Tapani Pälli wrote:
> Patch fixes failing test in WebGL conformance test
> 'point-no-attributes' when running Chrome on OpenGL ES.
> (Shader program may draw points using constant data in shader.)
>
> No Piglit regressions.
>
> Signed-off-by: Tapani Pälli
> ---
>  src/mesa/main/api_validate.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> index 51a3d1f..9b80600 100644
> --- a/src/mesa/main/api_validate.c
> +++ b/src/mesa/main/api_validate.c
> @@ -112,9 +112,8 @@ check_valid_to_render(struct gl_context *ctx, const char *function)
>
>     switch (ctx->API) {
>     case API_OPENGLES2:
> -      /* For ES2, we can draw if any vertex array is enabled (and we
> -       * should always have a vertex program/shader). */
> -      if (ctx->Array.VAO->_Enabled == 0x0 || !ctx->VertexProgram._Current)
> +      /* For ES2, we can draw if we have a vertex program/shader). */
> +      if (!ctx->VertexProgram._Current)
>           return GL_FALSE;
>        break;

Looks right to me. The git history shows that it's been this way since it
was written 5 years ago, and I see no comments, git commit explanations, or
spec text saying why it should be like this. Using constant data seems
totally reasonable, and we allow it on GL.

Thanks, Tapani.

Reviewed-by: Kenneth Graunke
[Mesa-dev] [PATCH 2/2] i965: Use BDW_MOCS_PTE for renderbuffers.
Write-back caching cannot be used for buffers being scanned out by the
display engine; surfaces used for scan-out must be write-through or
uncached. I originally chose WT for render targets because it works in
all cases. However, we really want to use write-back caching where
possible, as it is more efficient.

Most renderbuffers are not used for scanout - off-screen FBOs certainly
are fine, and non-pageflipped backbuffers should be fine as well. So
in most cases WB will work. However, we don't know what will be used
for scan-out, so we instead simply use the PTE value specified by the
kernel, as it knows these things.

This matches our MOCS choice on Haswell.

Fixes performance regressions since commit ee4484be3dc827cf15bcf109f5
in a microbenchmark (spotted by Eero Tamminen). Improves performance
in GLBenchmark 2.7/EgyptHD by 7.44362% +/- 0.496939% (n=55) on a
Broadwell GT2.

Signed-off-by: Kenneth Graunke
Reported-by: Eero Tamminen
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/gen8_surface_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Cc'd to stable because it's a pretty trivial change and provides a sizable
boost to performance on new hardware.

diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 40eb2ea..6dd343f 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -377,7 +377,7 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
              horizontal_alignment(mt) |
              surface_tiling_mode(tiling);

-   surf[1] = SET_FIELD(BDW_MOCS_WT, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
+   surf[1] = SET_FIELD(BDW_MOCS_PTE, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;

    surf[2] = SET_FIELD(width - 1, GEN7_SURFACE_WIDTH) |
              SET_FIELD(height - 1, GEN7_SURFACE_HEIGHT);
--
2.1.1
[Mesa-dev] [PATCH 1/2] i965: Add a BDW_MOCS_PTE #define.
Like BDW_MOCS_WB and BDW_MOCS_WT, this specifies that we want to use all
three caches (L3, LLC, and eLLC where available), but leaves the LLC
caching mode up to the kernel's page table entry. This allows the kernel
to pick WB/WT/UC based on whether it's using a buffer for scanout.

Signed-off-by: Kenneth Graunke
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/brw_defines.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Cc'd to stable because it's required by the next patch.

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index 2faebe8..5d09409 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -2386,8 +2386,12 @@ enum brw_wm_barycentric_interp_mode {
 #define HSW_MOCS_WB_LLC_WB_ELLC (2 << 1)
 #define HSW_MOCS_UC_LLC_WB_ELLC (3 << 1)

-/* Broadwell: write-back or write-through; always use all the caches. */
-#define BDW_MOCS_WB 0x78
-#define BDW_MOCS_WT 0x58
+/* Broadwell: these defines always use all available caches (L3, LLC, eLLC),
+ * and let you force write-back (WB) or write-through (WT) caching, or leave
+ * it up to the page table entry (PTE) specified by the kernel.
+ */
+#define BDW_MOCS_WB  0x78
+#define BDW_MOCS_WT  0x58
+#define BDW_MOCS_PTE 0x18

 #endif
--
2.1.1
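To make the use of these defines a little more concrete, here is a minimal sketch of packing a MOCS value into the same dword as the qpitch, in the spirit of the `surf[1]` line from patch 2/2. The three `BDW_MOCS_*` values are copied from the patch; `SKETCH_MOCS_SHIFT`, `SKETCH_MOCS_MASK` and `sketch_pack_dword1()` are assumptions made up for this example and should not be read as the real `GEN8_SURFACE_MOCS` field layout or the real `SET_FIELD()` macro.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch. */
#define BDW_MOCS_WB  0x78
#define BDW_MOCS_WT  0x58
#define BDW_MOCS_PTE 0x18

/* ASSUMPTION: an illustrative 7-bit field starting at bit 24.  The real
 * shift and mask live in brw_defines.h and are not shown in the patch. */
#define SKETCH_MOCS_SHIFT 24
#define SKETCH_MOCS_MASK  (0x7fu << SKETCH_MOCS_SHIFT)

/* Shift the MOCS value into its field and OR in qpitch / 4.  Note that in
 * the patch's expression ">>" binds tighter than "|", so only the qpitch
 * is shifted right, exactly as written out here. */
static uint32_t
sketch_pack_dword1(uint32_t mocs, uint32_t qpitch)
{
   assert(mocs <= 0x7f);
   return ((mocs << SKETCH_MOCS_SHIFT) & SKETCH_MOCS_MASK) | (qpitch >> 2);
}
```

The only difference between the WT and PTE variants of the packed dword is the MOCS field; the three values also share their low five bits (0x18) and differ only in the bits above them.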
Re: [Mesa-dev] [PATCH 2/2] i965: Use BDW_MOCS_PTE for renderbuffers.
On Tue, Sep 30, 2014 at 01:15:56AM -0700, Kenneth Graunke wrote:
> Write-back caching cannot be used for buffers being scanned out by the
> display engine; surfaces used for scan-out must be write-through or
> uncached. I originally chose WT for render targets because it works in
> all cases. However, we really want to use write-back caching where
> possible, as it is more efficient.
>
> Most renderbuffers are not used for scanout - off-screen FBOs certainly
> are fine, and non-pageflipped backbuffers should be fine as well. So
> in most cases WB will work. However, we don't know what will be used
> for scan-out, so we instead simply use the PTE value specified by the
> kernel, as it knows these things.
>
> This matches our MOCS choice on Haswell.
>
> Fixes performance regressions since commit ee4484be3dc827cf15bcf109f5
> in a microbenchmark (spotted by Eero Tamminen). Improves performance
> in GLBenchmark 2.7/EgyptHD by 7.44362% +/- 0.496939% (n=55) on a
> Broadwell GT2.
>
> Signed-off-by: Kenneth Graunke
> Reported-by: Eero Tamminen
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/mesa/drivers/dri/i965/gen8_surface_state.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Cc'd to stable because it's a pretty trivial change and provides a sizable
> boost to performance on new hardware.

Both patches are Reviewed-by: Daniel Vetter

Aside: Not using WT on display can lead to corruption (apparently bdw is
fairly aggressive with writeback so hard to spot in reality), so imo
definitely stable material. With the hw display crc stuff we now support
in the kernel/igt we could even write an automated testcase for these
corruptions, but probably not worth the hassle.
-Daniel

>
> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> index 40eb2ea..6dd343f 100644
> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> @@ -377,7 +377,7 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
>               horizontal_alignment(mt) |
>               surface_tiling_mode(tiling);
>
> -   surf[1] = SET_FIELD(BDW_MOCS_WT, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
> +   surf[1] = SET_FIELD(BDW_MOCS_PTE, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
>
>     surf[2] = SET_FIELD(width - 1, GEN7_SURFACE_WIDTH) |
>               SET_FIELD(height - 1, GEN7_SURFACE_HEIGHT);
> --
> 2.1.1

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [Mesa-dev] [PATCH 1/4] radeonsi: Fix tiling mode index for stencil resources
For the series:

Reviewed-by: Marek Olšák

Marek

On Tue, Sep 30, 2014 at 5:58 AM, Michel Dänzer wrote:
> From: Michel Dänzer
>
> We are currently only dealing with depth-only or stencil-only resources
> here, not with resources having both depth and stencil[0]. In both cases,
> the tiling mode index is in the tile_mode field, not in the
> stencil_tile_mode field.
>
> [0] Add an assertion for that.
>
> Signed-off-by: Michel Dänzer
> ---
>  src/gallium/drivers/radeonsi/si_dma.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_dma.c b/src/gallium/drivers/radeonsi/si_dma.c
> index c067cd9..cd6ff4a 100644
> --- a/src/gallium/drivers/radeonsi/si_dma.c
> +++ b/src/gallium/drivers/radeonsi/si_dma.c
> @@ -162,6 +162,8 @@ static void si_dma_copy_tile(struct si_context *ctx,
>         tiled_y = detile ? src_y : dst_y;
>         tiled_z = detile ? src_z : dst_z;
>
> +       assert(!util_format_is_depth_and_stencil(rtiled->resource.b.b.format));
> +
>         array_mode = si_array_mode(rtiled->surface.level[tiled_lvl].mode);
>         slice_tile_max = (rtiled->surface.level[tiled_lvl].nblk_x *
>                           rtiled->surface.level[tiled_lvl].nblk_y) / (8*8) - 1;
> @@ -179,8 +181,7 @@ static void si_dma_copy_tile(struct si_context *ctx,
>         bank_w = cik_bank_wh(rtiled->surface.bankw);
>         mt_aspect = cik_macro_tile_aspect(rtiled->surface.mtilea);
>         tile_split = cik_tile_split(rtiled->surface.tile_split);
> -       tile_mode_index = si_tile_mode_index(rtiled, tiled_lvl,
> -                                            util_format_has_stencil(util_format_description(rtiled->resource.b.b.format)));
> +       tile_mode_index = si_tile_mode_index(rtiled, tiled_lvl, false);
>         nbanks = si_num_banks(sscreen, rtiled);
>         base += rtiled->resource.gpu_address;
>         addr += rlinear->resource.gpu_address;
> --
> 2.1.1
Re: [Mesa-dev] [PATCH libdrm] radeon: Always multiply pitch_bytes by nsamples, not by slice_pt
Reviewed-by: Marek Olšák

Marek

On Tue, Sep 30, 2014 at 5:58 AM, Michel Dänzer wrote:
> From: Michel Dänzer
>
> slice_pt is tileb[0] / tile_split, which isn't directly related to the
> pitch.
>
> This caused pitch_bytes to be too large in some cases.
>
> [0] Tile size in bytes
>
> Signed-off-by: Michel Dänzer
> ---
>  radeon/radeon_surface.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/radeon/radeon_surface.c b/radeon/radeon_surface.c
> index 0723425..930017e 100644
> --- a/radeon/radeon_surface.c
> +++ b/radeon/radeon_surface.c
> @@ -595,7 +595,7 @@ static void eg_surf_minify(struct radeon_surface *surf,
>      mtile_ps = (mtile_pr * surflevel->nblk_y) / mtileh;
>
>      surflevel->offset = offset;
> -    surflevel->pitch_bytes = surflevel->nblk_x * bpe * slice_pt;
> +    surflevel->pitch_bytes = surflevel->nblk_x * bpe * surf->nsamples;
>      surflevel->slice_size = mtile_ps * mtileb * slice_pt;
>
>      surf->bo_size = offset + surflevel->slice_size * surflevel->nblk_z * surf->array_size;
> @@ -1498,7 +1498,7 @@ static void si_surf_minify_2d(struct radeon_surface *surf,
>      /* macro tile per slice */
>      mtile_ps = (mtile_pr * surflevel->nblk_y) / yalign;
>      surflevel->offset = offset;
> -    surflevel->pitch_bytes = surflevel->nblk_x * bpe * slice_pt;
> +    surflevel->pitch_bytes = surflevel->nblk_x * bpe * surf->nsamples;
>      surflevel->slice_size = mtile_ps * mtileb * slice_pt;
>
>      surf->bo_size = offset + surflevel->slice_size * surflevel->nblk_z * surf->array_size;
> --
> 2.1.1
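The arithmetic behind this fix can be sketched with a few made-up numbers. The helpers below are hypothetical and only mirror the two expressions in the diff; the floor of 1 in `sketch_slice_pt()` is an assumption added here so the example stays well defined, and the input values are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Pitch in bytes as computed after the patch: blocks per row times bytes
 * per element times the number of samples. */
static uint32_t
sketch_pitch_bytes(uint32_t nblk_x, uint32_t bpe, uint32_t nsamples)
{
   assert(nsamples >= 1);
   return nblk_x * bpe * nsamples;
}

/* slice_pt as described in the commit message: tile size in bytes divided
 * by tile_split.  ASSUMPTION: clamped to at least 1 for this sketch. */
static uint32_t
sketch_slice_pt(uint32_t tileb, uint32_t tile_split)
{
   uint32_t pt;

   assert(tile_split > 0);
   pt = tileb / tile_split;
   return pt ? pt : 1;
}
```

For a hypothetical single-sample surface with nblk_x = 256, bpe = 4, tileb = 4096 and tile_split = 2048, the old expression `nblk_x * bpe * slice_pt` gives 2048 bytes, while the new one gives 1024 bytes, which is the "too large in some cases" effect the commit message describes.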
[Mesa-dev] [Bug 84242] FTBFS: libOpenCL.so.1.0.0: ld: .eh_frame_hdr table[5707] FDE at 0000000000c45b8c overlaps table[5708] FDE at 0000000000c45a88
https://bugs.freedesktop.org/show_bug.cgi?id=84242

--- Comment #10 from David Kredba ---
The same result with gcc 5.0 svn rev. 215679.

.eh_frame_hdr table[5712] FDE at 00c45788 overlaps table[5713] FDE at 00c45684.

Now I will try older binutils with two gcc versions used before.
Re: [Mesa-dev] [PATCH v3] glsl: Optimize min/max expression trees
I just noticed that we should add:

index a4fe2bd..ca53eb8 100644
--- a/src/glsl/opt_minmax.cpp
+++ b/src/glsl/opt_minmax.cpp
@@ -415,6 +415,17 @@ ir_minmax_visitor::prune_expression(ir_expression *expr, minmax_range baserange)
       }
    }

+   /* If we got here we could not discard any of the operands of the minmax
+    * expression, but we can still try to resolve the expression if both
+    * operands are constant. We do this after the loop above, to make sure
+    * that if our operands are minmax expressions we have tried to prune them
+    * first (hopefully reducing them to constants).
+    */
+   ir_constant *a = expr->operands[0]->as_constant();
+   ir_constant *b = expr->operands[1]->as_constant();
+   if (a && b)
+      return combine_constant(ismin, a, b);
+
    return expr;
 }

at the bottom of prune_expression.

This makes sure that when we prune the operands of a minmax expression to
constants, we also resolve the parent expression to a constant; otherwise
we will leave the parent with two constant arguments. I noticed this while
reworking the unit tests for mixed vectors.

Connor: if you give the okay to this change I will squash it in before
pushing.

Iago

On Mon, 2014-09-29 at 13:19 -0400, Connor Abbott wrote:
> On Mon, Sep 29, 2014 at 7:49 AM, Iago Toral Quiroga wrote:
> > Original patch by Petri Latvala :
> >
> > Add an optimization pass that drops min/max expression operands that
> > can be proven to not contribute to the final result. The algorithm is
> > similar to alpha-beta pruning on a minmax search, from the field of
> > AI.
> >
> > This optimization pass can optimize min/max expressions where operands
> > are min/max expressions. Such code can appear in shaders by itself, or
> > as the result of clamp() or AMD_shader_trinary_minmax functions.
> >
> > This optimization pass improves the generated code for piglit's
> > AMD_shader_trinary_minmax tests as follows:
> >
> > total instructions in shared programs: 75 -> 67 (-10.67%)
> > instructions in affected programs: 60 -> 52 (-13.33%)
> > GAINED: 0
> > LOST: 0
> >
> > All tests (max3, min3, mid3) improved.
> >
> > A full shader-db run:
> >
> > total instructions in shared programs: 4293603 -> 4293575 (-0.00%)
> > instructions in affected programs: 1188 -> 1160 (-2.36%)
> > GAINED: 0
> > LOST: 0
> >
> > Improvements happen in Guacamelee and Serious Sam 3. One shader from
> > Dungeon Defenders is hurt by shader-db metrics (26 -> 28), because of
> > dropping of a (constant float (0.0)) operand, which was
> > compiled to a saturate modifier.
> >
> > Version 2 by Iago Toral Quiroga :
> >
> > Changes from review feedback:
> > - Squashed various cosmetic changes sent by Matt Turner.
> > - Make less_all_components return an enum rather than setting a class
> >   member. (Suggested by Matt Turner). Also, renamed it to
> >   compare_components.
> > - Make less_all_components, smaller_constant and larger_constant static.
> >   (Suggested by Matt Turner)
> > - Change minmax_range to call its limits "low" and "high" instead of
> >   "range[0]" and "range[1]". (Suggested by Connor Abbott).
> > - Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested by
> >   Connor Abbott).
> > - Make the logic clearer by rearranging the code and commenting.
> >   (Suggested by Connor Abbott).
> > - Added comment to explain why we need to recurse twice. (Suggested by
> >   Connor Abbott).
> > - If we cannot prune an expression, do not return early. Instead, attempt
> >   to prune its children. (Suggested by Connor Abbott).
> >
> > Other changes:
> > - Instead of having a global "valid" visitor member, let the various
> >   functions that can determine this status return a boolean and check
> >   for its value to decide what to do in each case. This is more flexible
> >   and allows us to recurse into children of parents that could not be
> >   pruned due to invalid ranges (so related to the last bullet in the
> >   review feedback).
> > - Make sure we always check if a range is valid before working with it.
> >   Since any use of get_range, combine_range or range_intersection can
> >   invalidate a range we should check for this situation every time we
> >   use any of these functions.
> >
> > Version 3 by Iago Toral Quiroga :
> >
> > Changes from review feedback:
> > - Now we can make get_range, combine_range and range_intersection static
> >   too (suggested by Connor Abbott).
> > - Do not return NULL when looking for the larger or greater constant into
> >   mixed vector constants. Instead, produce a new constant by doing a
> >   component-wise minmax. With this we can also remove of the validations
> >   when we call into these functions (suggested by Connor Abbott).
> > - Add a comment explaining the meaning of the baserange argument in
> >   prune_expression (suggested by Connor Abbott).
> >
> > Oth
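The "component-wise minmax" mentioned in the version 3 notes, which is also what the proposed constant-folding step relies on, can be sketched roughly as follows. `struct sketch_constant` and `sketch_combine_constant()` are hypothetical stand-ins written for this illustration; they are a guess at the idea, not the `ir_constant` class or the `combine_constant()` helper from the actual patch.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_COMPONENTS 4

/* Hypothetical stand-in for a float vector constant. */
struct sketch_constant {
   unsigned num_components;
   float value[SKETCH_MAX_COMPONENTS];
};

/* Fold min(a, b) or max(a, b) of two constant vectors into one constant by
 * picking the smaller (ismin) or larger (!ismin) value per component.
 * This also handles "mixed" vectors where neither operand is smaller in
 * every component. */
static struct sketch_constant
sketch_combine_constant(bool ismin, struct sketch_constant a,
                        struct sketch_constant b)
{
   struct sketch_constant result;
   unsigned i;

   assert(a.num_components == b.num_components);
   assert(a.num_components <= SKETCH_MAX_COMPONENTS);

   result = a;
   for (i = 0; i < a.num_components; i++) {
      if (ismin ? (b.value[i] < a.value[i]) : (b.value[i] > a.value[i]))
         result.value[i] = b.value[i];
   }
   return result;
}
```

For the mixed pair a = (1, 5) and b = (3, 2), the min folds to (1, 2) and the max to (3, 5), so the parent expression no longer has to keep two constant operands around.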
Re: [Mesa-dev] [PATCH] egl: setup screen iterator before using it
On 29.09.2014 19:07, Matt Turner wrote:
> On Mon, Sep 29, 2014 at 5:08 AM, Tapani Pälli wrote:
>> commit 4ed23fd broke creation of pbuffer surfaces, patch fixes
>> the failure, noticed when running chrome with '--use-gl=egl'.
>
> Cc'ing JP so he can review as well.
>
> Reviewed-by: Matt Turner
>

Just to ack, we discussed this with Tapani last night and the patch was
already committed as

Reviewed-by: Juha-Pekka Heikkila
[Mesa-dev] SandyBridge's 'resinfo' -> returned value for SURFTYPE_BUFFER?
Hello,

I am looking at bug 57439 [0], which shows an error in a piglit test [1]
related to the textureSize() function on Intel SandyBridge hardware.

According to SNB's PRM documentation (vol4 part1 page 141), the returned
value for SURFTYPE_BUFFER (the surface type used in the test) is not
defined in the 'resinfo' message type. In IvyBridge's documentation it is
defined as the buffer size, which is calculated from the combined
Depth/Height/Width values.

As it is not clear that SNB returns the same value as IVB for that kind of
message and surface type, I am sending this email to ask for a
clarification :-)

Best regards,

Sam

[0] https://bugs.freedesktop.org/show_bug.cgi?id=57439
[1] ./bin/textureSize 140 fs samplerBuffer -auto -fbo
[Mesa-dev] [Bug 84242] FTBFS: libOpenCL.so.1.0.0: ld: .eh_frame_hdr table[5707] FDE at 0000000000c45b8c overlaps table[5708] FDE at 0000000000c45a88
https://bugs.freedesktop.org/show_bug.cgi?id=84242

--- Comment #11 from David Kredba ---
With Gentoo vanilla binutils 2.24-r3 with two slim LTO patches and the patch
referred to by Emil Velikov in Comment #3

https://projects.archlinux.org/svntogit/packages.git/plain/trunk/binutils-2.24-shared-pie.patch?h=packages/binutils&id=47bdd59a9967ee8dd2bcc47797855185c6471546

it builds fine even with LTO enabled (using a trick with calling configure
with LTO turned off and then -fno-lto -fno-use-linker-plugin removed from
each Makefile).

So trunk binutils seems to be the source of the problem.
I have to start with bisecting them.
[Mesa-dev] [Bug 81680] [r600g] Firefox crashes with hardware acceleration turned on
https://bugs.freedesktop.org/show_bug.cgi?id=81680

Marek Olšák changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #105815|0                           |1
        is obsolete|                            |

--- Comment #40 from Marek Olšák ---
Created attachment 107124
  --> https://bugs.freedesktop.org/attachment.cgi?id=107124&action=edit
possible fix

Could you please test this patch?
Re: [Mesa-dev] [PATCH v2] replace file specific compiler optimization with inline attribute
Hi Matt,

On Thursday, 25 September 2014, 09:56:42, Marc Dietrich wrote:
> On Wednesday, 24 September 2014, 18:35:24, Matt Turner wrote:
> > On Wed, Sep 24, 2014 at 6:25 AM, Marc Dietrich wrote:
> > > On Monday, 22 September 2014, 11:48:29, Matt Turner wrote:
> > >> We need a configure check for support for __attribute__((target)). I'm
> > >> going to send a series that adds support for this (and does the check
> > >> for existing attribute uses, so once that goes in you can rebase this
> > >> patch on that).
> > >
> > > nice, but won't work with the workaround above. Pragma and attribute
> > > do the same, so we could check for the attribute and use the pragma
> > > instead.
> >
> > I wonder if the best thing to do is to add target("sse4.1") in
> > addition to using -msse4.1. That way, we'll retain compatibility with
>
> The idea of this patch was to remove per file optimization flags because
> this breaks LTO. LTO will recompile all files during the final link and
> apply any "high-level" compiler flags from a single file (e.g. -msse4.1) to
> all files used in the linking process.

I tried to find some hints on how gcc handles this. Unfortunately, the gcc
docs aren't very helpful [1] and I failed to construct a test case :-(

I tend to say that gcc does not apply the target options in the final link
to *all* files, so this problem does not seem to exist at all (I'm running
lto compiled mesa on amdfam10h with no sse4.1 support and see no crashes so
far).

As a side note, using "-msse4.1 -fno-lto" would prevent it in any case and
also be compatible with clang.

Marc

[1]: info gcc on -flto:

When producing the final binary with `-flto', GCC only applies link-time
optimizations to those files that contain bytecode. Therefore, you can mix
and match object files and libraries with GIMPLE bytecodes and final object
code. GCC automatically selects which files to optimize in LTO mode and
which files to link without further processing.

There are some code generation flags preserved by GCC when generating
bytecodes, as they need to be used during the final link stage. Currently,
the following options are saved into the GIMPLE bytecode files: `-fPIC',
`-fcommon' and all the `-m' target flags.

At link time, these options are read in and reapplied. Note that the
current implementation makes no attempt to recognize conflicting values for
these options. If different files have conflicting option values (e.g., one
file is compiled with `-fPIC' and another isn't), the compiler simply uses
the last value read from the bytecode files. It is recommended, then, that
you compile all the files participating in the same link with the same
options.
[Mesa-dev] [PATCH] llvmpipe: move lp_jit_screen_init() call after allocation of screen object
The screen argument isn't actually used by lp_jit_screen_init() at this
time, but let's move the call so that we pass a valid pointer.

v2: don't leak screen if lp_jit_screen_init() fails.
---
 src/gallium/drivers/llvmpipe/lp_screen.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 3025322..a264f99 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -557,9 +557,6 @@ llvmpipe_create_screen(struct sw_winsys *winsys)
       return NULL;
 #endif

-   if (!lp_jit_screen_init(screen))
-      return NULL;
-
 #ifdef DEBUG
    LP_DEBUG = debug_get_flags_option("LP_DEBUG", lp_debug_flags, 0 );
 #endif
@@ -570,6 +567,11 @@ llvmpipe_create_screen(struct sw_winsys *winsys)
    if (!screen)
       return NULL;

+   if (!lp_jit_screen_init(screen)) {
+      FREE(screen);
+      return NULL;
+   }
+
    screen->winsys = winsys;

    screen->base.destroy = llvmpipe_destroy_screen;
--
1.7.10.4
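The ordering this patch arrives at (allocate first, then initialize, and free on failure) is a common C pattern, and a small standalone sketch may help. Everything named `sketch_*` below is hypothetical and exists only for this example; the real code uses Mesa's own allocation macros and `lp_jit_screen_init()`, whose internals are not shown in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the screen object. */
struct sketch_screen {
   int jit_ready;
};

/* Stand-in for an init step that may fail.  It expects a valid pointer,
 * which is why it must run after the allocation. */
static bool
sketch_jit_init(struct sketch_screen *screen, bool succeed)
{
   assert(screen != NULL);
   if (!succeed)
      return false;
   screen->jit_ready = 1;
   return true;
}

/* Allocate, then initialize; release the allocation if init fails so the
 * error path does not leak (the v2 fix in the patch). */
static struct sketch_screen *
sketch_create_screen(bool jit_ok)
{
   struct sketch_screen *screen = calloc(1, sizeof(*screen));

   if (!screen)
      return NULL;

   if (!sketch_jit_init(screen, jit_ok)) {
      free(screen);
      return NULL;
   }
   return screen;
}
```

On success the caller owns the returned object and frees it later; on failure it gets NULL and nothing is left allocated.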
Re: [Mesa-dev] [PATCH 1/2] llvmpipe: move lp_jit_screen_init() call after allocation of screen object
On 09/29/2014 07:46 PM, Michel Dänzer wrote:
> On 30.09.2014 10:45, Michel Dänzer wrote:
>> On 30.09.2014 07:16, Brian Paul wrote:
>>> The screen argument isn't actually used by lp_jit_screen_init() at this
>>> time,
>>
>> I guess that's why gcc didn't warn about it?
>
> Nope, it actually does warn about it. Mea culpa for not noticing that.

Yeah, I patched this after seeing the gcc warning. New, non-leaking, patch
posted.

Thanks, Michel.

-Brian
[Mesa-dev] [Bug 84242] FTBFS: libOpenCL.so.1.0.0: ld: .eh_frame_hdr table[5707] FDE at 0000000000c45b8c overlaps table[5708] FDE at 0000000000c45a88
https://bugs.freedesktop.org/show_bug.cgi?id=84242

--- Comment #12 from Emil Velikov ---
(In reply to comment #11)
> With Gentoo vanilla binutils 2.24-r3 with two slim LTO patches and the patch
> referred by Emil Velikov in Comment #3
>
> https://projects.archlinux.org/svntogit/packages.git/plain/trunk/binutils-2.24-shared-pie.patch?h=packages/binutils&id=47bdd59a9967ee8dd2bcc47797855185c6471546
>
> it builds fine even with LTO enabled (using a trick with calling configure
> with LTO turned off and then -fno-lto -fno-use-linker-plugin removed from
> each Makefile).
>
> So trunk binutils seems to be source of the problem.
> I have to start with bisecting them.

Nicely done. I hope that the problem does not end up a sheep in wolf's
clothing - i.e. somewhere else. Using gcc+binutils to compile a third
library, which links to another two compiler(ish) products... there are so
many things that can be happening in there.

Thank you for the great initiative :)
Re: [Mesa-dev] [PATCH] mesa: relax draw api validation on ES2
On 09/30/2014 12:28 AM, Tapani Pälli wrote:
> Patch fixes failing test in WebGL conformance test
> 'point-no-attributes' when running Chrome on OpenGL ES.
> (Shader program may draw points using constant data in shader.)
>
> No Piglit regressions.

This sounds believable. Did you also try the ES2 or ES3 conformance
suite? I could have sworn that we had a bug related to this a long time
ago, and we discovered it using the conformance suite.

Either way, we should get a piglit test too... I think we have a test
for desktop OpenGL (maybe 3.1?), so it shouldn't be too hard to adapt that.

> Signed-off-by: Tapani Pälli
> ---
>  src/mesa/main/api_validate.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> index 51a3d1f..9b80600 100644
> --- a/src/mesa/main/api_validate.c
> +++ b/src/mesa/main/api_validate.c
> @@ -112,9 +112,8 @@ check_valid_to_render(struct gl_context *ctx, const char *function)
>
>     switch (ctx->API) {
>     case API_OPENGLES2:
> -      /* For ES2, we can draw if any vertex array is enabled (and we
> -       * should always have a vertex program/shader). */
> -      if (ctx->Array.VAO->_Enabled == 0x0 || !ctx->VertexProgram._Current)
> +      /* For ES2, we can draw if we have a vertex program/shader). */
> +      if (!ctx->VertexProgram._Current)
>           return GL_FALSE;
>        break;
>
Re: [Mesa-dev] [PATCH] radeonsi: fix CS tracing and remove excessive CS dumping
Jerome, Could you please review this? Thanks, Marek On Sat, Sep 20, 2014 at 12:26 PM, Marek Olšák wrote: > From: Marek Olšák > > --- > src/gallium/drivers/radeonsi/si_hw_context.c | 36 > ++-- > src/gallium/drivers/radeonsi/si_pipe.c | 3 ++- > src/gallium/drivers/radeonsi/si_state_draw.c | 21 > 3 files changed, 25 insertions(+), 35 deletions(-) > > diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c > b/src/gallium/drivers/radeonsi/si_hw_context.c > index eaefa6a..e030c75 100644 > --- a/src/gallium/drivers/radeonsi/si_hw_context.c > +++ b/src/gallium/drivers/radeonsi/si_hw_context.c > @@ -102,20 +102,8 @@ void si_context_gfx_flush(void *context, unsigned flags, > /* force to keep tiling flags */ > flags |= RADEON_FLUSH_KEEP_TILING_FLAGS; > > -#if SI_TRACE_CS > - if (ctx->screen->b.trace_bo) { > - struct si_screen *sscreen = ctx->screen; > - unsigned i; > - > - for (i = 0; i < cs->cdw; i++) { > - fprintf(stderr, "[%4d] [%5d] 0x%08x\n", > sscreen->b.cs_count, i, cs->buf[i]); > - } > - sscreen->b.cs_count++; > - } > -#endif > - > /* Flush the CS. 
*/ > - ctx->b.ws->cs_flush(cs, flags, fence, 0); > + ctx->b.ws->cs_flush(cs, flags, fence, ctx->screen->b.cs_count++); > ctx->b.rings.gfx.flushing = false; > > #if SI_TRACE_CS > @@ -125,7 +113,7 @@ void si_context_gfx_flush(void *context, unsigned flags, > > for (i = 0; i < 10; i++) { > usleep(5); > - if > (!ctx->ws->buffer_is_busy(sscreen->b.trace_bo->buf, RADEON_USAGE_READWRITE)) { > + if > (!ctx->b.ws->buffer_is_busy(sscreen->b.trace_bo->buf, > RADEON_USAGE_READWRITE)) { > break; > } > } > @@ -169,23 +157,3 @@ void si_begin_new_cs(struct si_context *ctx) > > ctx->b.initial_gfx_cs_size = ctx->b.rings.gfx.cs->cdw; > } > - > -#if SI_TRACE_CS > -void si_trace_emit(struct si_context *sctx) > -{ > - struct si_screen *sscreen = sctx->screen; > - struct radeon_winsys_cs *cs = sctx->cs; > - uint64_t va; > - > - va = sscreen->b.trace_bo->gpu_address; > - r600_context_bo_reloc(sctx, sscreen->b.trace_bo, > RADEON_USAGE_READWRITE); > - cs->buf[cs->cdw++] = PKT3(PKT3_WRITE_DATA, 4, 0); > - cs->buf[cs->cdw++] = > PKT3_WRITE_DATA_DST_SEL(PKT3_WRITE_DATA_DST_SEL_MEM_SYNC) | > - PKT3_WRITE_DATA_WR_CONFIRM | > - > PKT3_WRITE_DATA_ENGINE_SEL(PKT3_WRITE_DATA_ENGINE_SEL_ME); > - cs->buf[cs->cdw++] = va & 0xUL; > - cs->buf[cs->cdw++] = (va >> 32UL) & 0xUL; > - cs->buf[cs->cdw++] = cs->cdw; > - cs->buf[cs->cdw++] = sscreen->b.cs_count; > -} > -#endif > diff --git a/src/gallium/drivers/radeonsi/si_pipe.c > b/src/gallium/drivers/radeonsi/si_pipe.c > index 2cce5cc..cba6d98 100644 > --- a/src/gallium/drivers/radeonsi/si_pipe.c > +++ b/src/gallium/drivers/radeonsi/si_pipe.c > @@ -94,7 +94,8 @@ static struct pipe_context *si_create_context(struct > pipe_screen *screen, void * > } > > sctx->b.rings.gfx.cs = ws->cs_create(ws, RING_GFX, > si_context_gfx_flush, > -sctx, NULL); > +sctx, sscreen->b.trace_bo ? 
> + sscreen->b.trace_bo->cs_buf : > NULL); > sctx->b.rings.gfx.flush = si_context_gfx_flush; > > si_init_all_descriptors(sctx); > diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c > b/src/gallium/drivers/radeonsi/si_state_draw.c > index 041..a475344 100644 > --- a/src/gallium/drivers/radeonsi/si_state_draw.c > +++ b/src/gallium/drivers/radeonsi/si_state_draw.c > @@ -1025,3 +1025,24 @@ void si_draw_vbo(struct pipe_context *ctx, const > struct pipe_draw_info *info) > pipe_resource_reference(&ib.buffer, NULL); > sctx->b.num_draw_calls++; > } > + > +#if SI_TRACE_CS > +void si_trace_emit(struct si_context *sctx) > +{ > + struct si_screen *sscreen = sctx->screen; > + struct radeon_winsys_cs *cs = sctx->b.rings.gfx.cs; > + uint64_t va; > + > + va = sscreen->b.trace_bo->gpu_address; > + r600_context_bo_reloc(&sctx->b, &sctx->b.rings.gfx, > sscreen->b.trace_bo, > + RADEON_USAGE_READWRITE, RADEON_PRIO_MIN); > + radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 4, 0)); > + radeon_emit(cs, > PKT3_WRITE_DATA_DST_SEL(PKT3_WRITE_DATA_DST_SEL_MEM_SYNC) | > + PKT3_WRITE_DATA_WR_CONFIRM | > + > PKT3_WRITE_DATA_ENGINE_SEL(PKT3_WRITE_DATA_ENGINE_SEL_ME)); > + radeon_emit(cs, va & 0xUL); > + radeon_emit(cs, (va >> 32UL) & 0xUL); > + radeon_emit(cs,
Re: [Mesa-dev] [RFC PATCH 05/56] mesa/main: Add tessellation shader state and limits
On 09/20/2014 07:41 PM, Matt Turner wrote:
> On Sat, Sep 20, 2014 at 6:40 PM, Chris Forbes wrote:
>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>> index 79d2e94..c11ad4f 100644
>> --- a/src/mesa/main/shaderapi.c
>> +++ b/src/mesa/main/shaderapi.c
>> @@ -105,6 +105,7 @@ _mesa_get_shader_flags(void)
>>  void
>>  _mesa_init_shader_state(struct gl_context *ctx)
>>  {
>> +   int i;
>
> In context, this declaration looks odd. Move it below the two just
> after this hunk?

Not in core Mesa where we have to do dumb ol' C89. :(

>>     /* Device drivers may override these to control what kind of instructions
>>      * are generated by the GLSL compiler.
>>      */
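For readers unfamiliar with the C89 constraint mentioned in the reply: every declaration in a block has to come before the first statement of that block, which is why `int i;` sits at the very top of the function. A minimal sketch follows; `EX_NUM_STAGES` and `ex_count_enabled` are hypothetical names used only for illustration and are not part of Mesa.

```c
#define EX_NUM_STAGES 4  /* hypothetical stage count, illustration only */

/* C89 style: all declarations precede the first statement of the block,
 * so the loop counter cannot be declared inside the for() header.
 */
static int
ex_count_enabled(const int enabled[EX_NUM_STAGES])
{
   int i;          /* declared up front, before any statement */
   int total = 0;

   for (i = 0; i < EX_NUM_STAGES; i++) {
      if (enabled[i])
         total++;
   }

   return total;
}
```

Under C99 or later the counter could be written as `for (int i = 0; ...)`, which is the placement the first reviewer appears to have had in mind.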
Re: [Mesa-dev] [RFC PATCH 10/56] mesa: Generalize sso stage interleaving check for tess
On 09/20/2014 06:40 PM, Chris Forbes wrote:
> Signed-off-by: Chris Forbes
> ---
>  src/mesa/main/pipelineobj.c | 53 +++--
>  1 file changed, 37 insertions(+), 16 deletions(-)
>
> diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
> index c902107..b91289e 100644
> --- a/src/mesa/main/pipelineobj.c
> +++ b/src/mesa/main/pipelineobj.c
> @@ -662,6 +662,38 @@ program_stages_all_active(struct gl_pipeline_object *pipe,
>     return status;
>  }
>
> +static bool
> +program_stages_interleaved_illegally(const struct gl_pipeline_object *pipe)
> +{
> +   struct gl_shader_program *prev = NULL;
> +   unsigned i, j;
> +
> +   /* Look for programs bound to stages: A -> B -> A, with
> +    * any intervening sequence of unrelated programs or
> +    * empty stages
> +    */

I think this (and perhaps the next comment) are wrapped too narrow. :)

> +
> +   for (i = 0; i < MESA_SHADER_STAGES; i++) {
> +      /* Empty stages anywhere in the pipe are OK */
> +      if (!pipe->CurrentProgram[i])
> +         continue;
> +
> +      if (prev && pipe->CurrentProgram[i] != prev) {
> +         /* We've seen an A -> B transition; look at the rest of
> +          * the pipe to see if we ever see A again.
> +          */
> +         for (j = i + 1; j < MESA_SHADER_STAGES; j++) {
> +            if (pipe->CurrentProgram[j] == prev)
> +               return true;
> +         }
> +      }

It took me a bit to convince myself that this code is correct. I think
this would be a good place for a unit test. Since this is a good
clean-up for this code, I think it could also land before the rest of
the series.

> +
> +      prev = pipe->CurrentProgram[i];
> +   }
> +
> +   return false;
> +}
> +
>  extern GLboolean
>  _mesa_validate_program_pipeline(struct gl_context* ctx,
>                                  struct gl_pipeline_object *pipe,
> @@ -714,22 +746,11 @@ _mesa_validate_program_pipeline(struct gl_context* ctx,
>      * Without Tesselation, the only case where this can occur is the geometry
>      * shader between the fragment shader and vertex shader.
>      */
> -   if (pipe->CurrentProgram[MESA_SHADER_GEOMETRY]
> -       && pipe->CurrentProgram[MESA_SHADER_FRAGMENT]
> -       && pipe->CurrentProgram[MESA_SHADER_VERTEX]) {
> -      if (pipe->CurrentProgram[MESA_SHADER_VERTEX]->Name == pipe->CurrentProgram[MESA_SHADER_FRAGMENT]->Name &&
> -          pipe->CurrentProgram[MESA_SHADER_GEOMETRY]->Name != pipe->CurrentProgram[MESA_SHADER_VERTEX]->Name) {
> -         pipe->InfoLog =
> -            ralloc_asprintf(pipe,
> -                            "Program %d is active for geometry stage between "
> -                            "two stages for which another program %d is "
> -                            "active",
> -                            pipe->CurrentProgram[MESA_SHADER_GEOMETRY]->Name,
> -                            pipe->CurrentProgram[MESA_SHADER_VERTEX]->Name);
> -         goto err;
> -      }
> -
> -      /* XXX tess */
> +   if (program_stages_interleaved_illegally(pipe)) {
> +      pipe->InfoLog = ralloc_strdup(pipe, "Program is active for multiple shader "
> +                                          "stages with an intervening stage provided "
> +                                          "by another program");
> +      goto err;
>     }
>
>     /* Section 2.11.11 (Shader Execution), subheading "Validation," of the
>
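The review asks for a unit test of the interleaving check. As a rough sketch of what such a test could exercise, the loop can be lifted out of Mesa into a standalone function. Everything here is an assumption made for illustration: `EX_NUM_STAGES`, `ex_interleaved_illegally`, and the use of plain program IDs (0 meaning an empty stage) in place of `struct gl_shader_program` pointers are not taken from the patch.

```c
#include <stdbool.h>

#define EX_NUM_STAGES 6  /* hypothetical number of pipeline stages */

/* Returns true when the stage array contains an A -> B -> A pattern,
 * i.e. a program that reappears after a different program was bound to
 * an intervening stage. A value of 0 marks an empty stage and is skipped.
 */
static bool
ex_interleaved_illegally(const unsigned prog[EX_NUM_STAGES])
{
   unsigned prev = 0;
   unsigned i, j;

   for (i = 0; i < EX_NUM_STAGES; i++) {
      /* Empty stages anywhere in the pipe are OK */
      if (prog[i] == 0)
         continue;

      if (prev != 0 && prog[i] != prev) {
         /* A -> B transition seen; scan the rest for another A. */
         for (j = i + 1; j < EX_NUM_STAGES; j++) {
            if (prog[j] == prev)
               return true;
         }
      }

      prev = prog[i];
   }

   return false;
}
```

Cases worth covering are a program split around a different one (illegal), the same program on non-adjacent stages with only empty stages between (legal), and a longer chain such as A, B, C, B (illegal because B reappears after C).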
Re: [Mesa-dev] [RFC PATCH 15/56] mesa/main: Add misc tessellation shader stuff.
On 09/20/2014 06:40 PM, Chris Forbes wrote: > From: Fabian Bieler > > --- > src/mesa/main/context.c | 6 + > src/mesa/main/mtypes.h| 3 ++- > src/mesa/main/shaderapi.c | 29 > src/mesa/main/state.c | 67 > +-- > 4 files changed, 102 insertions(+), 3 deletions(-) > > diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c > index d9be2f5..d4190b6 100644 > --- a/src/mesa/main/context.c > +++ b/src/mesa/main/context.c > @@ -1904,6 +1904,12 @@ _mesa_valid_to_render(struct gl_context *ctx, const > char *where) > */ > (void) from_glsl_shader[MESA_SHADER_GEOMETRY]; > > + /* FINISHME: If GL_NV_tessellation_program is ever supported, the current > +* FINISHME: tessellation control and evaluation programs should > validated here. > +*/ > + (void) from_glsl_shader[GL_TESS_CONTROL_PROGRAM_NV]; > + (void) from_glsl_shader[GL_TESS_EVALUATION_PROGRAM_NV]; I think you mean MESA_. > + > if (!from_glsl_shader[MESA_SHADER_FRAGMENT]) { >if (ctx->FragmentProgram.Enabled && !ctx->FragmentProgram._Enabled) { >_mesa_error(ctx, GL_INVALID_OPERATION, > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h > index 9088e97..9bd78e4 100644 > --- a/src/mesa/main/mtypes.h > +++ b/src/mesa/main/mtypes.h > @@ -2566,7 +2566,8 @@ struct gl_sl_pragmas > */ > struct gl_shader > { > - /** GL_FRAGMENT_SHADER || GL_VERTEX_SHADER || GL_GEOMETRY_SHADER_ARB. > + /** GL_FRAGMENT_SHADER || GL_VERTEX_SHADER || GL_GEOMETRY_SHADER_ARB || > +* GL_TESS_CONTROL_SHADER || GL_TESS_EVALUATION_SHADER. > * Must be the first field. 
> */ > GLenum Type; > diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c > index 8160062..7ef9f74 100644 > --- a/src/mesa/main/shaderapi.c > +++ b/src/mesa/main/shaderapi.c > @@ -206,6 +206,10 @@ _mesa_validate_shader_target(const struct gl_context > *ctx, GLenum type) >return ctx == NULL || ctx->Extensions.ARB_vertex_shader; > case GL_GEOMETRY_SHADER_ARB: >return ctx == NULL || _mesa_has_geometry_shaders(ctx); > + case GL_TESS_CONTROL_SHADER: > + return ctx == NULL || ctx->Extensions.ARB_tessellation_shader; > + case GL_TESS_EVALUATION_SHADER: > + return ctx == NULL || ctx->Extensions.ARB_tessellation_shader; > case GL_COMPUTE_SHADER: >return ctx == NULL || ctx->Extensions.ARB_compute_shader; > default: > @@ -423,6 +427,8 @@ detach_shader(struct gl_context *ctx, GLuint program, > GLuint shader) > /* sanity check - make sure the new list's entries are sensible */ > for (j = 0; j < shProg->NumShaders; j++) { > assert(shProg->Shaders[j]->Type == GL_VERTEX_SHADER || > + shProg->Shaders[j]->Type == GL_TESS_CONTROL_SHADER || > + shProg->Shaders[j]->Type == GL_TESS_EVALUATION_SHADER || > shProg->Shaders[j]->Type == GL_GEOMETRY_SHADER || > shProg->Shaders[j]->Type == GL_FRAGMENT_SHADER); > assert(shProg->Shaders[j]->RefCount > 0); > @@ -1041,6 +1047,12 @@ print_shader_info(const struct gl_shader_program > *shProg) > if (shProg->_LinkedShaders[MESA_SHADER_GEOMETRY]) >printf(" geom prog %u\n", >shProg->_LinkedShaders[MESA_SHADER_GEOMETRY]->Program->Id); > + if (shProg->_LinkedShaders[MESA_SHADER_TESS_CTRL]) > + printf(" tesc prog %u\n", > + shProg->_LinkedShaders[MESA_SHADER_TESS_CTRL]->Program->Id); > + if (shProg->_LinkedShaders[MESA_SHADER_TESS_EVAL]) > + printf(" tese prog %u\n", > + shProg->_LinkedShaders[MESA_SHADER_TESS_EVAL]->Program->Id); > } > > > @@ -1117,6 +1129,8 @@ void > _mesa_use_program(struct gl_context *ctx, struct gl_shader_program *shProg) > { > use_shader_program(ctx, GL_VERTEX_SHADER, shProg, &ctx->Shader); > + use_shader_program(ctx, 
GL_TESS_CONTROL_SHADER, shProg, &ctx->Shader); > + use_shader_program(ctx, GL_TESS_EVALUATION_SHADER, shProg, &ctx->Shader); > use_shader_program(ctx, GL_GEOMETRY_SHADER_ARB, shProg, &ctx->Shader); > use_shader_program(ctx, GL_FRAGMENT_SHADER, shProg, &ctx->Shader); > use_shader_program(ctx, GL_COMPUTE_SHADER, shProg, &ctx->Shader); > @@ -1959,6 +1973,21 @@ _mesa_copy_linked_program_data(gl_shader_stage type, > case MESA_SHADER_VERTEX: >dst->UsesClipDistanceOut = src->Vert.UsesClipDistance; >break; > + case MESA_SHADER_TESS_CTRL: { > + struct gl_tess_ctrl_program *dst_tcp = > + (struct gl_tess_ctrl_program *) dst; > + dst_tcp->VerticesOut = src->TessCtrl.VerticesOut; > + } > + break; > + case MESA_SHADER_TESS_EVAL: { > + struct gl_tess_eval_program *dst_tep = > + (struct gl_tess_eval_program *) dst; > + dst_tep->PrimitiveMode = src->TessEval.PrimitiveMode; > + dst_tep->Spacing = src->TessEval.Spacing; > + dst_tep->VertexOrder = src->TessEval.VertexOrder; > + dst_tep
Re: [Mesa-dev] [RFC PATCH 09/56] mesa: Allow tess stages in glUseProgramStages
On 09/20/2014 06:40 PM, Chris Forbes wrote:
> ---
>  src/mesa/main/pipelineobj.c | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
> index 61a5785..c902107 100644
> --- a/src/mesa/main/pipelineobj.c
> +++ b/src/mesa/main/pipelineobj.c
> @@ -243,14 +243,13 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield stages, GLuint program)
>      *
>      *     "If stages is not the special value ALL_SHADER_BITS, and has a bit
>      *     set that is not recognized, the error INVALID_VALUE is generated."
> -    *
> -    * NOT YET SUPPORTED:
> -    * GL_TESS_CONTROL_SHADER_BIT
> -    * GL_TESS_EVALUATION_SHADER_BIT
>      */
>     any_valid_stages = GL_VERTEX_SHADER_BIT | GL_FRAGMENT_SHADER_BIT;
>     if (_mesa_has_geometry_shaders(ctx))
>        any_valid_stages |= GL_GEOMETRY_SHADER_BIT;
> +   if (ctx->Extensions.ARB_tessellation_shader)
> +      any_valid_stages |= GL_TESS_CONTROL_SHADER_BIT |
> +                          GL_TESS_EVALUATION_SHADER_BIT;
>
>     if (stages != GL_ALL_SHADER_BITS && (stages & ~any_valid_stages) != 0) {
>        _mesa_error(ctx, GL_INVALID_VALUE, "glUseProgramStages(Stages)");
> @@ -326,6 +325,12 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield stages, GLuint program)
>
>     if ((stages & GL_GEOMETRY_SHADER_BIT) != 0)
>        _mesa_use_shader_program(ctx, GL_GEOMETRY_SHADER, shProg, pipe);
> +
> +   if ((stages & GL_TESS_CONTROL_SHADER_BIT) != 0)
> +      _mesa_use_shader_program(ctx, GL_TESS_CONTROL_SHADER, shProg, pipe);
> +
> +   if ((stages & GL_TESS_EVALUATION_SHADER_BIT) != 0)
> +      _mesa_use_shader_program(ctx, GL_TESS_EVALUATION_SHADER, shProg, pipe);
>  }
>
>  /**
> @@ -723,6 +728,8 @@ _mesa_validate_program_pipeline(struct gl_context* ctx,
>                            pipe->CurrentProgram[MESA_SHADER_VERTEX]->Name);
>           goto err;
>        }
> +
> +      /* XXX tess */

Other places in Mesa use FINISHME. I haven't gotten far enough in this
series to see if these are fixed, so this comment may be irrelevant.

>     }
>
>     /* Section 2.11.11 (Shader Execution), subheading "Validation," of the
> @@ -742,6 +749,8 @@ _mesa_validate_program_pipeline(struct gl_context* ctx,
>         && pipe->CurrentProgram[MESA_SHADER_GEOMETRY]) {
>        pipe->InfoLog = ralloc_strdup(pipe, "Program lacks a vertex shader");
>        goto err;
> +
> +      /* XXX: tess */
>     }
>
>     /* Section 2.11.11 (Shader Execution), subheading "Validation," of the
>
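The stage-bit validation in the first hunk can be sketched in isolation as follows. This is an illustration only: the `EX_*` constants and `ex_stages_valid` are hypothetical stand-ins for the GL bit names and the Mesa context checks, and the two boolean parameters replace the `_mesa_has_geometry_shaders(ctx)` and `ctx->Extensions.ARB_tessellation_shader` tests.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GL stage bits, illustration only. */
#define EX_VERTEX_BIT        0x01u
#define EX_FRAGMENT_BIT      0x02u
#define EX_GEOMETRY_BIT      0x04u
#define EX_TESS_CONTROL_BIT  0x08u
#define EX_TESS_EVAL_BIT     0x10u
#define EX_ALL_BITS          0xFFFFFFFFu

/* Builds the mask of stages the context supports, then rejects any
 * request that sets a bit outside that mask, unless the request is the
 * special "all stages" value.
 */
static bool
ex_stages_valid(unsigned stages, bool has_geometry, bool has_tess)
{
   unsigned any_valid = EX_VERTEX_BIT | EX_FRAGMENT_BIT;

   if (has_geometry)
      any_valid |= EX_GEOMETRY_BIT;
   if (has_tess)
      any_valid |= EX_TESS_CONTROL_BIT | EX_TESS_EVAL_BIT;

   return stages == EX_ALL_BITS || (stages & ~any_valid) == 0;
}
```

With this shape, enabling a new stage only requires OR-ing its bits into the mask; the rejection test itself stays unchanged, which matches how the patch extends the existing check.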
Re: [Mesa-dev] [RFC PATCH 16/56] mesa/program: Add misc tessellation shader support.
On 09/20/2014 06:40 PM, Chris Forbes wrote: > From: Fabian Bieler > > --- > src/mesa/program/program.c | 44 ++ > src/mesa/program/program.h | 60 > +- > 2 files changed, 103 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/program/program.c b/src/mesa/program/program.c > index dc030b0..d7c457a 100644 > --- a/src/mesa/program/program.c > +++ b/src/mesa/program/program.c > @@ -101,6 +101,14 @@ _mesa_init_program(struct gl_context *ctx) > _mesa_reference_geomprog(ctx, &ctx->GeometryProgram.Current, > NULL); > > + ctx->TessCtrlProgram.Enabled = GL_FALSE; > + _mesa_reference_tesscprog(ctx, &ctx->TessCtrlProgram.Current, > +NULL); > + > + ctx->TessEvalProgram.Enabled = GL_FALSE; > + _mesa_reference_tesseprog(ctx, &ctx->TessEvalProgram.Current, > +NULL); > + Indentation looks off here. Mixed tabs? > /* XXX probably move this stuff */ > ctx->ATIFragmentShader.Enabled = GL_FALSE; > ctx->ATIFragmentShader.Current = ctx->Shared->DefaultFragmentShader; > @@ -120,6 +128,8 @@ _mesa_free_program_data(struct gl_context *ctx) > _mesa_reference_fragprog(ctx, &ctx->FragmentProgram.Current, NULL); > _mesa_delete_shader_cache(ctx, ctx->FragmentProgram.Cache); > _mesa_reference_geomprog(ctx, &ctx->GeometryProgram.Current, NULL); > + _mesa_reference_tesscprog(ctx, &ctx->TessCtrlProgram.Current, NULL); > + _mesa_reference_tesseprog(ctx, &ctx->TessEvalProgram.Current, NULL); > > /* XXX probably move this stuff */ > if (ctx->ATIFragmentShader.Current) { > @@ -152,6 +162,12 @@ _mesa_update_default_objects_program(struct gl_context > *ctx) > _mesa_reference_geomprog(ctx, &ctx->GeometryProgram.Current, >ctx->Shared->DefaultGeometryProgram); > > + _mesa_reference_tesscprog(ctx, &ctx->TessCtrlProgram.Current, > + ctx->Shared->DefaultTessCtrlProgram); > + > + _mesa_reference_tesseprog(ctx, &ctx->TessEvalProgram.Current, > + ctx->Shared->DefaultTessEvalProgram); > + > /* XXX probably move this stuff */ > if (ctx->ATIFragmentShader.Current) { >ctx->ATIFragmentShader.Current->RefCount--; > @@ 
-373,6 +389,16 @@ _mesa_new_program(struct gl_context *ctx, GLenum target, > GLuint id) > CALLOC_STRUCT(gl_geometry_program), > target, id); >break; > + case GL_TESS_CONTROL_PROGRAM_NV: > + prog = _mesa_init_tess_ctrl_program(ctx, > + > CALLOC_STRUCT(gl_tess_ctrl_program), > + target, id); > + break; > + case GL_TESS_EVALUATION_PROGRAM_NV: > + prog = _mesa_init_tess_eval_program(ctx, > + CALLOC_STRUCT(gl_tess_eval_program), > + target, id); > + break; > case GL_COMPUTE_PROGRAM_NV: >prog = _mesa_init_compute_program(ctx, > CALLOC_STRUCT(gl_compute_program), > @@ -590,6 +616,24 @@ _mesa_clone_program(struct gl_context *ctx, const struct > gl_program *prog) > gpc->UsesStreams = gp->UsesStreams; >} >break; > + case GL_TESS_CONTROL_PROGRAM_NV: > + { > + const struct gl_tess_ctrl_program *tcp = > gl_tess_ctrl_program_const(prog); > + struct gl_tess_ctrl_program *tcpc = gl_tess_ctrl_program(clone); > + tcpc->VerticesOut = tcp->VerticesOut; > + // XXX: tcpc->UsesBarrier = tcp->UseBarrier; This comment seems odd. None of the other places mention this missing field, and why is this field missing? 
> + } > + break; > + case GL_TESS_EVALUATION_PROGRAM_NV: > + { > + const struct gl_tess_eval_program *tep = > gl_tess_eval_program_const(prog); > + struct gl_tess_eval_program *tepc = gl_tess_eval_program(clone); > + tepc->PrimitiveMode = tep->PrimitiveMode; > + tepc->Spacing = tep->Spacing; > + tepc->VertexOrder = tep->VertexOrder; > + tepc->PointMode = tep->PointMode; > + } > + break; > default: >_mesa_problem(NULL, "Unexpected target in _mesa_clone_program"); > } > diff --git a/src/mesa/program/program.h b/src/mesa/program/program.h > index dd5198a..0216e62 100644 > --- a/src/mesa/program/program.h > +++ b/src/mesa/program/program.h > @@ -148,6 +148,24 @@ _mesa_reference_geomprog(struct gl_context *ctx, > (struct gl_program *) prog); > } > > +static inline void > +_mesa_reference_tesscprog(struct gl_context *ctx, > + struct gl_tess_ctrl_program **ptr, > + struct gl_tess_ctrl_program *prog) > +{ > + _mesa_reference_program(ctx, (struct gl_program **) ptr, > + (st
[Mesa-dev] [PATCH] gallium/util: add util_bitcount64
From: Marek Olšák

I'll need this in radeonsi.
---
 src/gallium/auxiliary/util/u_math.h | 8
 1 file changed, 8 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_math.h b/src/gallium/auxiliary/util/u_math.h
index 39bd40f..48d5c31 100644
--- a/src/gallium/auxiliary/util/u_math.h
+++ b/src/gallium/auxiliary/util/u_math.h
@@ -727,6 +727,14 @@ util_bitcount(unsigned n)
 #endif
 }
 
+
+static INLINE unsigned
+util_bitcount64(uint64_t n)
+{
+   return util_bitcount(n) + util_bitcount(n >> 32);
+}
+
+
 /**
  * Reverse bits in n
  * Algorithm taken from:
-- 
1.9.1
Re: [Mesa-dev] [RFC PATCH 17/56] mesa: Add support for UNIFORM_BLOCK_REFERENCED_BY_TESS_*_SHADER
On 09/20/2014 06:40 PM, Chris Forbes wrote:
> Signed-off-by: Chris Forbes
> ---
>  src/mesa/main/uniforms.c | 21 +
>  1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
> index 0d0cbf5..ceeadf4 100644
> --- a/src/mesa/main/uniforms.c
> +++ b/src/mesa/main/uniforms.c
> @@ -1127,6 +1127,18 @@ _mesa_GetActiveUniformBlockiv(GLuint program,
>        params[0] = shProg->UniformBlockStageIndex[MESA_SHADER_VERTEX][uniformBlockIndex] != -1;
>        return;
>
> +   case GL_UNIFORM_BLOCK_REFERENCED_BY_TESS_CONTROL_SHADER:
> +      if (!ctx->Extensions.ARB_tessellation_shader)
> +         break;
> +      params[0] = shProg->UniformBlockStageIndex[MESA_SHADER_TESS_CTRL][uniformBlockIndex] != -1;
> +      return;
> +
> +   case GL_UNIFORM_BLOCK_REFERENCED_BY_TESS_EVALUATION_SHADER:
> +      if (!ctx->Extensions.ARB_tessellation_shader)
> +         break;
> +      params[0] = shProg->UniformBlockStageIndex[MESA_SHADER_TESS_EVAL][uniformBlockIndex] != -1;
> +      return;
> +
>     case GL_UNIFORM_BLOCK_REFERENCED_BY_GEOMETRY_SHADER:
>        params[0] = shProg->UniformBlockStageIndex[MESA_SHADER_GEOMETRY][uniformBlockIndex] != -1;
>        return;
> @@ -1136,11 +1148,12 @@ _mesa_GetActiveUniformBlockiv(GLuint program,
>        return;
>
>     default:
> -      _mesa_error(ctx, GL_INVALID_ENUM,
> -                  "glGetActiveUniformBlockiv(pname 0x%x (%s))",
> -                  pname, _mesa_lookup_enum_by_nr(pname));
> -      return;
> +      break;
>     }
> +
> +   _mesa_error(ctx, GL_INVALID_ENUM,
> +               "glGetActiveUniformBlockiv(pname 0x%x (%s))",
> +               pname, _mesa_lookup_enum_by_nr(pname));
>  }

This last hunk seems spurious. Does some later patch depend on this?

>  void GLAPIENTRY
>
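One possible reading of the last hunk, offered as an interpretation rather than the author's stated intent: the new tessellation cases `break` out of the switch when the extension is missing, so the error has to be raised after the switch instead of inside `default:` for those breaks to reach it. The sketch below shows that control flow with hypothetical names (`ex_query`, `EX_PNAME_*`, `EX_INVALID_ENUM`); none of them are Mesa identifiers.

```c
/* Hypothetical query names and result codes, illustration only. */
enum { EX_PNAME_VERTEX = 1, EX_PNAME_TESS_CONTROL = 2 };
enum { EX_OK = 0, EX_INVALID_ENUM = 1 };

/* Supported cases return from inside the switch. Anything that breaks
 * out, whether an unknown pname or a known pname whose extension is not
 * enabled, lands on the single error path after the switch.
 */
static int
ex_query(int pname, int has_tess, int *out)
{
   switch (pname) {
   case EX_PNAME_VERTEX:
      *out = 1;
      return EX_OK;

   case EX_PNAME_TESS_CONTROL:
      if (!has_tess)
         break;            /* treated exactly like an unknown pname */
      *out = 1;
      return EX_OK;

   default:
      break;
   }

   return EX_INVALID_ENUM;
}
```

If the error stayed inside `default:`, the extension-gated `break` would fall out of the switch and the function would return without reporting anything.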
Re: [Mesa-dev] [PATCH] gallium/util: add util_bitcount64
Perhaps do the same thing as util_bitcount, i.e.

#if defined(PIPE_CC_GCC) && (PIPE_CC_GCC_VERSION >= 304)
   return __builtin_popcountll(n);
#else
   ...
#endif

Perhaps the gcc version check is no longer necessary, unlikely anyone's
using gcc3.3 or earlier at this point. But whatever.

On Tue, Sep 30, 2014 at 12:26 PM, Marek Olšák wrote:
> From: Marek Olšák
>
> I'll need this in radeonsi.
> ---
>  src/gallium/auxiliary/util/u_math.h | 8
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/gallium/auxiliary/util/u_math.h b/src/gallium/auxiliary/util/u_math.h
> index 39bd40f..48d5c31 100644
> --- a/src/gallium/auxiliary/util/u_math.h
> +++ b/src/gallium/auxiliary/util/u_math.h
> @@ -727,6 +727,14 @@ util_bitcount(unsigned n)
>  #endif
>  }
>
> +
> +static INLINE unsigned
> +util_bitcount64(uint64_t n)
> +{
> +   return util_bitcount(n) + util_bitcount(n >> 32);
> +}
> +
> +
>  /**
>   * Reverse bits in n
>   * Algorithm taken from:
> --
> 1.9.1
>
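A minimal sketch of the combined suggestion, assuming a plain `__GNUC__` test in place of Mesa's `PIPE_CC_GCC` macros: use the compiler builtin where it is available and otherwise add the bit counts of the two 32-bit halves, as the posted patch does. The names `ex_bitcount32`, `ex_bitcount64_fallback`, and `ex_bitcount64` are hypothetical and not part of `u_math.h`.

```c
#include <stdint.h>

/* Portable 32-bit population count: clears the lowest set bit on each
 * iteration, so the loop runs once per set bit.
 */
static unsigned
ex_bitcount32(uint32_t n)
{
   unsigned bits = 0;

   while (n) {
      n &= n - 1;
      bits++;
   }
   return bits;
}

/* Fallback path: count the low and high 32-bit halves separately. */
static unsigned
ex_bitcount64_fallback(uint64_t n)
{
   return ex_bitcount32((uint32_t) n) + ex_bitcount32((uint32_t) (n >> 32));
}

static unsigned
ex_bitcount64(uint64_t n)
{
#if defined(__GNUC__)
   /* GCC and compatible compilers provide a 64-bit popcount builtin. */
   return (unsigned) __builtin_popcountll(n);
#else
   return ex_bitcount64_fallback(n);
#endif
}
```

The explicit casts to `uint32_t` make the truncation to the low half visible; the posted patch relies on the implicit conversion to `unsigned` at the `util_bitcount` call for the same effect.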
[Mesa-dev] [PATCH] tgsi: fix Semantic.Name assignment in tgsi_transform_input_decl()
Assign the sem_name parameter, not TGSI_SEMANTIC_GENERIC.
Fixes polygon stipple regression.
---
 src/gallium/auxiliary/tgsi/tgsi_transform.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_transform.h b/src/gallium/auxiliary/tgsi/tgsi_transform.h
index bfcdd56..921aa90 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_transform.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_transform.h
@@ -120,7 +120,7 @@ tgsi_transform_input_decl(struct tgsi_transform_context *ctx,
    decl.Declaration.File = TGSI_FILE_INPUT;
    decl.Declaration.Interpolate = 1;
    decl.Declaration.Semantic = 1;
-   decl.Semantic.Name = TGSI_SEMANTIC_GENERIC;
+   decl.Semantic.Name = sem_name;
    decl.Semantic.Index = sem_index;
    decl.Range.First =
    decl.Range.Last = index;
-- 
1.7.10.4
Re: [Mesa-dev] [RFC PATCH 12/56] mesa: Add tessellation shader builtin varyings.
On 09/20/2014 06:40 PM, Chris Forbes wrote: > From: Fabian Bieler > > --- > src/mesa/main/mtypes.h| 15 ++- > src/mesa/program/prog_print.c | 4 > 2 files changed, 18 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h > index 9e989d7..9088e97 100644 > --- a/src/mesa/main/mtypes.h > +++ b/src/mesa/main/mtypes.h > @@ -239,6 +239,8 @@ typedef enum > VARYING_SLOT_VIEWPORT, /* Appears as VS or GS output */ > VARYING_SLOT_FACE, /* FS only */ > VARYING_SLOT_PNTC, /* FS only */ > + VARYING_SLOT_TESS_LEVEL_OUTER, /* Appears in both tessellation shaders. */ > + VARYING_SLOT_TESS_LEVEL_INNER, /* Appears in both tessellation shaders. */ > VARYING_SLOT_VAR0, /* First generic varying slot */ > VARYING_SLOT_MAX = VARYING_SLOT_VAR0 + MAX_VARYING > } gl_varying_slot; > @@ -275,6 +277,8 @@ typedef enum > #define VARYING_BIT_VIEWPORT BITFIELD64_BIT(VARYING_SLOT_VIEWPORT) > #define VARYING_BIT_FACE BITFIELD64_BIT(VARYING_SLOT_FACE) > #define VARYING_BIT_PNTC BITFIELD64_BIT(VARYING_SLOT_PNTC) > +#define VARYING_BIT_TESS_LEVEL_OUTER > BITFIELD64_BIT(VARYING_SLOT_TESS_LEVEL_OUTER) > +#define VARYING_BIT_TESS_LEVEL_INNER > BITFIELD64_BIT(VARYING_SLOT_TESS_LEVEL_INNER) > #define VARYING_BIT_VAR(V) BITFIELD64_BIT(VARYING_SLOT_VAR0 + (V)) > /*@}*/ > > @@ -298,6 +302,8 @@ _mesa_varying_slot_in_fs(gl_varying_slot slot) > case VARYING_SLOT_EDGE: > case VARYING_SLOT_CLIP_VERTEX: > case VARYING_SLOT_LAYER: > + case VARYING_SLOT_TESS_LEVEL_OUTER: > + case VARYING_SLOT_TESS_LEVEL_INNER: >return GL_FALSE; > default: >return GL_TRUE; > @@ -2140,7 +2146,7 @@ typedef enum > * \name Geometry shader system values > */ > /*@{*/ > - SYSTEM_VALUE_INVOCATION_ID, > + SYSTEM_VALUE_INVOCATION_ID, /**< (Also in Tessellation Control shader) */ > /*@}*/ > > /** > @@ -2153,6 +2159,13 @@ typedef enum > SYSTEM_VALUE_SAMPLE_MASK_IN, > /*@}*/ > > + /** > +* \name Tessellation Evaluation shader system values > +*/ > + /*@{*/ > + SYSTEM_VALUE_TESS_COORD, > + /*@}*/ > + This hunk 
and the previous hunk should get merged with the hunk in patch 19. I don't think it matters much whether they go to 19 or 19 comes here. > SYSTEM_VALUE_MAX /**< Number of values */ > } gl_system_value; > > diff --git a/src/mesa/program/prog_print.c b/src/mesa/program/prog_print.c > index 475e241..26881e8 100644 > --- a/src/mesa/program/prog_print.c > +++ b/src/mesa/program/prog_print.c > @@ -147,6 +147,8 @@ arb_input_attrib_string(GLint index, GLenum progType) >"fragment.(twenty-one)", /* VARYING_SLOT_VIEWPORT */ >"fragment.(twenty-two)", /* VARYING_SLOT_FACE */ >"fragment.(twenty-three)", /* VARYING_SLOT_PNTC */ > + "fragment.(twenty-four)", /* VARYING_SLOT_TESS_LEVEL_OUTER */ > + "fragment.(twenty-five)", /* VARYING_SLOT_TESS_LEVEL_INNER */ >"fragment.varying[0]", >"fragment.varying[1]", >"fragment.varying[2]", > @@ -272,6 +274,8 @@ arb_output_attrib_string(GLint index, GLenum progType) >"result.(twenty-one)", /* VARYING_SLOT_VIEWPORT */ >"result.(twenty-two)", /* VARYING_SLOT_FACE */ >"result.(twenty-three)", /* VARYING_SLOT_PNTC */ > + "result.(twenty-four)", /* VARYING_SLOT_TESS_LEVEL_OUTER */ > + "result.(twenty-five)", /* VARYING_SLOT_TESS_LEVEL_INNER */ >"result.varying[0]", >"result.varying[1]", >"result.varying[2]", > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH 21/56] glsl: Add tessellation shader defines and built-in variables.
On 09/20/2014 06:41 PM, Chris Forbes wrote: > From: Fabian Bieler > > --- > src/glsl/builtin_variables.cpp | 62 > +- > src/glsl/glcpp/glcpp-parse.y | 3 ++ > 2 files changed, 64 insertions(+), 1 deletion(-) > > diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp > index 5b6f4ae..7ba0fe8 100644 > --- a/src/glsl/builtin_variables.cpp > +++ b/src/glsl/builtin_variables.cpp > @@ -343,6 +343,8 @@ public: > void generate_constants(); > void generate_uniforms(); > void generate_vs_special_vars(); > + void generate_tcs_special_vars(); > + void generate_tes_special_vars(); > void generate_gs_special_vars(); > void generate_fs_special_vars(); > void generate_cs_special_vars(); > @@ -842,6 +844,40 @@ builtin_variable_generator::generate_vs_special_vars() > > > /** > + * Generate variables which only exist in tessellation control shaders. > + */ > +void > +builtin_variable_generator::generate_tcs_special_vars() > +{ > + add_input(-1, int_t, "gl_PatchVerticesIn"); > + add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");// XXX: or > sysval? > + add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, "gl_InvocationID"); > + > + add_output(VARYING_SLOT_TESS_LEVEL_OUTER, > +array(float_t, 4), "gl_TessLevelOuter"); > + add_output(VARYING_SLOT_TESS_LEVEL_INNER, > +array(float_t, 2), "gl_TessLevelInner"); > +} > + > + > +/** > + * Generate variables which only exist in tessellation evaluation shaders. > + */ > +void > +builtin_variable_generator::generate_tes_special_vars() > +{ > + add_input(-1, int_t, "gl_PatchVerticesIn"); > + add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");// XXX: or > sysval? > + add_system_value(SYSTEM_VALUE_TESS_COORD, vec3_t, "gl_TessCoord"); > + > + add_input(VARYING_SLOT_TESS_LEVEL_OUTER, > +array(float_t, 4), "gl_TessLevelOuter"); > + add_input(VARYING_SLOT_TESS_LEVEL_INNER, > +array(float_t, 2), "gl_TessLevelInner"); > +} > + > + > +/** > * Generate variables which only exist in geometry shaders. 
> */ > void > @@ -964,6 +1000,9 @@ builtin_variable_generator::add_varying(int slot, const > glsl_type *type, > const char *name_as_gs_input) > { > switch (state->stage) { > + case MESA_SHADER_TESS_CTRL: > + case MESA_SHADER_TESS_EVAL: > + // XXX: is this correct? > case MESA_SHADER_GEOMETRY: >this->per_vertex_in.add_field(slot, type, name); >/* FALLTHROUGH */ > @@ -1016,13 +1055,28 @@ builtin_variable_generator::generate_varyings() >} > } > > + if (state->stage == MESA_SHADER_TESS_CTRL || > + state->stage == MESA_SHADER_TESS_EVAL) { > + const glsl_type *per_vertex_in_type = > + this->per_vertex_in.construct_interface_instance(); > + add_variable("gl_in", array(per_vertex_in_type, > state->Const.MaxPatchVertices), This looks wrong, but I believe that it is correct. Maybe add a spec quotation? /* Section 7.1 (Built-In Language Variables) of the GLSL 4.00 spec * says: * *"In the tessellation control language, built-in variables are *intrinsically declared as: * *in gl_PerVertex { *vec4 gl_Position; *float gl_PointSize; *float gl_ClipDistance[]; *} gl_in[gl_MaxPatchVertices];" */ It may also be worth adding a similar quotation to the MESA_SHADER_GEOMETRY case below. 
> + ir_var_shader_in, -1); > + } > if (state->stage == MESA_SHADER_GEOMETRY) { >const glsl_type *per_vertex_in_type = > this->per_vertex_in.construct_interface_instance(); >add_variable("gl_in", array(per_vertex_in_type, 0), > ir_var_shader_in, -1); > } > - if (state->stage == MESA_SHADER_VERTEX || state->stage == > MESA_SHADER_GEOMETRY) { > + if (state->stage == MESA_SHADER_TESS_CTRL) { > + const glsl_type *per_vertex_out_type = > + this->per_vertex_out.construct_interface_instance(); > + add_variable("gl_out", array(per_vertex_out_type, 0), > + ir_var_shader_out, -1); > + } > + if (state->stage == MESA_SHADER_VERTEX || > + state->stage == MESA_SHADER_TESS_EVAL || > + state->stage == MESA_SHADER_GEOMETRY) { >const glsl_type *per_vertex_out_type = > this->per_vertex_out.construct_interface_instance(); >const glsl_struct_field *fields = > per_vertex_out_type->fields.structure; > @@ -1057,6 +,12 @@ _mesa_glsl_initialize_variables(exec_list > *instructions, > case MESA_SHADER_VERTEX: >gen.generate_vs_special_vars(); >break; > + case MESA_SHADER_TESS_CTRL: > + gen.generate_tcs_special_vars(); > + break; > + case MESA_SHADER_TESS_EVAL: > + gen.generate_tes_special_vars(); > + break;
Re: [Mesa-dev] [RFC PATCH 00/56] ARB_tessellation_shader for core mesa
On 09/20/2014 06:40 PM, Chris Forbes wrote:
> This series adds all the driver-independent bits for ARB_tessellation_shader.
> It's not quite finished, and there are still a handful of ugly hacks to
> remove, but I think it's complete enough to start getting some review
> feedback.

Patches 1, 2, and 4 through 11, 13, 14, 15, 18, and 20 are

Reviewed-by: Ian Romanick

I agree with Ken's comments about patch 3.

I sent a couple comments on patches 10, 12 (that also applies to 19),
15, 16, 17, and 21.

I'll try to either get more comments or more R-b out later this week.
[Mesa-dev] [PATCH 03/13] radeonsi: get fs_write_all from tgsi_shader_info directly
From: Marek Olšák --- src/gallium/drivers/radeonsi/si_shader.c | 8 ++-- src/gallium/drivers/radeonsi/si_shader.h | 6 -- src/gallium/drivers/radeonsi/si_state.c | 5 + 3 files changed, 3 insertions(+), 16 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 5c3efd4..e76b969 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -1438,11 +1438,6 @@ static void si_llvm_emit_fs_epilogue(struct lp_build_tgsi_context * bld_base) tgsi_parse_token(parse); - if (parse->FullToken.Token.Type == TGSI_TOKEN_TYPE_PROPERTY && - parse->FullToken.FullProperty.Property.PropertyName == - TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS) - shader->fs_write_all = TRUE; - if (parse->FullToken.Token.Type != TGSI_TOKEN_TYPE_DECLARATION) continue; @@ -1499,7 +1494,8 @@ static void si_llvm_emit_fs_epilogue(struct lp_build_tgsi_context * bld_base) memcpy(last_args, args, sizeof(args)); /* Handle FS_COLOR0_WRITES_ALL_CBUFS. */ - if (shader->fs_write_all && shader->output[i].sid == 0 && + if (shader->selector->info.properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS][0] && +shader->output[i].sid == 0 && si_shader_ctx->shader->key.ps.nr_cbufs > 1) { for (int c = 1; c < si_shader_ctx->shader->key.ps.nr_cbufs; c++) { si_llvm_init_export_args_load(bld_base, diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h index 8f5b431..c6026bd 100644 --- a/src/gallium/drivers/radeonsi/si_shader.h +++ b/src/gallium/drivers/radeonsi/si_shader.h @@ -124,11 +124,6 @@ struct si_shader_selector { /* PIPE_SHADER_[VERTEX|FRAGMENT|...] */ unsignedtype; - - /* 1 when the shader contains -* TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS, otherwise it's 0. 
-* Used to determine whether we need to include nr_cbufs in the key */ - unsignedfs_write_all; }; union si_shader_key { @@ -184,7 +179,6 @@ struct si_shader { unsignednparam; booluses_instanceid; - boolfs_write_all; boolvs_out_misc_write; boolvs_out_point_size; boolvs_out_edgeflag; diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index 0e2d6c4..eb25606 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -2215,7 +2215,7 @@ static INLINE void si_shader_selector_key(struct pipe_context *ctx, key->vs.gs_used_inputs = sctx->gs_shader->current->gs_used_inputs; } } else if (sel->type == PIPE_SHADER_FRAGMENT) { - if (sel->fs_write_all) + if (sel->info.properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS][0]) key->ps.nr_cbufs = sctx->framebuffer.state.nr_cbufs; key->ps.export_16bpc = sctx->framebuffer.export_16bpc; @@ -2312,9 +2312,6 @@ static void *si_create_shader_state(struct pipe_context *ctx, sel->so = state->stream_output; tgsi_scan_shader(state->tokens, &sel->info); - if (pipe_shader_type == PIPE_SHADER_FRAGMENT) - sel->fs_write_all = sel->info.color0_writes_all_cbufs; - r = si_shader_select(ctx, sel); if (r) { free(sel); -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
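The `info.properties[TGSI_PROPERTY_...][0]` lookups in this patch rely on shader properties being stored in a table indexed by property name, as introduced by patch 02/13 of the same series. A rough sketch of that idea follows, using hypothetical `ex_*` names and sizes rather than the real `tgsi_shader_info` layout. The fallback to 32 output vertices mirrors the `draw_gs.c` hunk of patch 02/13.

```c
#include <string.h>

/* Hypothetical property names and data width, illustration only. */
enum ex_property {
   EX_PROP_GS_MAX_OUTPUT_VERTICES,
   EX_PROP_FS_COLOR0_WRITES_ALL_CBUFS,
   EX_PROP_COUNT
};
#define EX_PROP_MAX_DATA 8

struct ex_shader_info {
   /* One row per property; a zeroed row reads as "property not set". */
   unsigned properties[EX_PROP_COUNT][EX_PROP_MAX_DATA];
};

static void
ex_info_init(struct ex_shader_info *info)
{
   memset(info, 0, sizeof(*info));
}

/* Records the data words of one property, as a scanner would when it
 * meets a property token. Extra words beyond the row width are dropped.
 */
static void
ex_info_set_property(struct ex_shader_info *info, enum ex_property name,
                     const unsigned *data, unsigned count)
{
   unsigned i;

   for (i = 0; i < count && i < EX_PROP_MAX_DATA; i++)
      info->properties[name][i] = data[i];
}

/* Direct lookup replaces a loop over a property list; the caller has to
 * supply its own default because zero cannot be told apart from "unset".
 */
static unsigned
ex_gs_max_output_vertices(const struct ex_shader_info *info)
{
   unsigned n = info->properties[EX_PROP_GS_MAX_OUTPUT_VERTICES][0];

   return n ? n : 32;
}
```

The trade-off in this layout is that a property whose legitimate value is zero looks the same as one that was never declared, so defaults other than zero need an explicit check like the one above.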
[Mesa-dev] [PATCH 02/13] tgsi: simplify shader properties in tgsi_shader_info
From: Marek Olšák Use an array of properties indexed by TGSI_PROPERTY_* definitions. --- src/gallium/auxiliary/draw/draw_gs.c | 23 - src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 15 +++--- src/gallium/auxiliary/tgsi/tgsi_scan.c | 59 ++-- src/gallium/auxiliary/tgsi/tgsi_scan.h | 6 +-- src/gallium/auxiliary/util/u_pstipple.c | 8 +--- src/gallium/drivers/llvmpipe/lp_state_fs.c | 10 +--- src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c | 24 +++--- src/gallium/drivers/r300/r300_fs.c | 8 +--- src/gallium/drivers/radeonsi/si_shader.c | 53 +++-- 9 files changed, 70 insertions(+), 136 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c index 878fcca..0c2f892 100644 --- a/src/gallium/auxiliary/draw/draw_gs.c +++ b/src/gallium/auxiliary/draw/draw_gs.c @@ -750,9 +750,6 @@ draw_create_geometry_shader(struct draw_context *draw, tgsi_scan_shader(state->tokens, &gs->info); /* setup the defaults */ - gs->input_primitive = PIPE_PRIM_TRIANGLES; - gs->output_primitive = PIPE_PRIM_TRIANGLE_STRIP; - gs->max_output_vertices = 32; gs->max_out_prims = 0; #ifdef HAVE_LLVM @@ -768,17 +765,15 @@ draw_create_geometry_shader(struct draw_context *draw, gs->vector_length = 1; } - for (i = 0; i < gs->info.num_properties; ++i) { - if (gs->info.properties[i].name == - TGSI_PROPERTY_GS_INPUT_PRIM) - gs->input_primitive = gs->info.properties[i].data[0]; - else if (gs->info.properties[i].name == - TGSI_PROPERTY_GS_OUTPUT_PRIM) - gs->output_primitive = gs->info.properties[i].data[0]; - else if (gs->info.properties[i].name == - TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES) - gs->max_output_vertices = gs->info.properties[i].data[0]; - } + gs->input_primitive = + gs->info.properties[TGSI_PROPERTY_GS_INPUT_PRIM][0]; + gs->output_primitive = + gs->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM][0]; + gs->max_output_vertices = + gs->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0]; + if (!gs->max_output_vertices) + gs->max_output_vertices = 32; + /* 
Primitive boundary is bigger than max_output_vertices by one, because * the specification says that the geometry shader should exit if the * number of emitted vertices is bigger or equal to max_output_vertices and diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c index c0bd7be..2d7f32d 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c @@ -3855,8 +3855,8 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm, * were forgetting so we're using MAX_VERTEX_VARYING from * that spec even though we could debug_assert if it's not * set, but that's a lot uglier. */ - uint max_output_vertices = 32; - uint i = 0; + uint max_output_vertices; + /* inputs are always indirect with gs */ bld.indirect_files |= (1 << TGSI_FILE_INPUT); bld.gs_iface = gs_iface; @@ -3864,12 +3864,11 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm, bld.bld_base.op_actions[TGSI_OPCODE_EMIT].emit = emit_vertex; bld.bld_base.op_actions[TGSI_OPCODE_ENDPRIM].emit = end_primitive; - for (i = 0; i < info->num_properties; ++i) { - if (info->properties[i].name == - TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES) { -max_output_vertices = info->properties[i].data[0]; - } - } + max_output_vertices = +info->properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0]; + if (!max_output_vertices) + max_output_vertices = 32; + bld.max_output_vertices_vec = lp_build_const_int_vec(gallivm, bld.bld_base.int_bld.type, max_output_vertices); diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c b/src/gallium/auxiliary/tgsi/tgsi_scan.c index c71bb36..f9d1896 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c @@ -277,13 +277,11 @@ tgsi_scan_shader(const struct tgsi_token *tokens, { const struct tgsi_full_property *fullprop = &parse.FullToken.FullProperty; +unsigned name = fullprop->Property.PropertyName; -info->properties[info->num_properties].name = - 
fullprop->Property.PropertyName; -memcpy(info->properties[info->num_properties].data, - fullprop->u, 8 * sizeof(unsigned));; - -++info->num_properties; +assert(name < Elements(info->properties)); +memcpy(info->properties[name], + fullprop->u, 8 * sizeof(unsigned)); } break; @@ -296,35 +294,26 @@ tgsi_scan_shader(const struct tgsi_token *t
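The idea in this patch, replacing a list of (name, data) pairs that every caller has to scan with a table indexed directly by property name, can be sketched in isolation. This is only an illustration under stated assumptions: the `PROP_*` enum, `struct shader_info`, `record_property` and `gs_max_output_vertices` are hypothetical stand-ins, not the real TGSI definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical property names standing in for TGSI_PROPERTY_*. */
enum prop_name {
   PROP_GS_INPUT_PRIM,
   PROP_GS_OUTPUT_PRIM,
   PROP_GS_MAX_OUTPUT_VERTICES,
   PROP_COUNT
};

#define PROP_DATA_WORDS 8

/* Hypothetical stand-in for tgsi_shader_info: one row per property name.
 * A zero-initialized struct reads as "property not set" (all zeros). */
struct shader_info {
   unsigned properties[PROP_COUNT][PROP_DATA_WORDS];
};

/* Scanner side: store the data words in the row for this property. */
static void
record_property(struct shader_info *info, unsigned name,
                const unsigned data[PROP_DATA_WORDS])
{
   assert(name < PROP_COUNT);
   memcpy(info->properties[name], data, PROP_DATA_WORDS * sizeof(unsigned));
}

/* Consumer side: a direct lookup replaces the loop over all properties.
 * An unset property reads as 0, so the default is applied after the lookup. */
static unsigned
gs_max_output_vertices(const struct shader_info *info)
{
   unsigned n = info->properties[PROP_GS_MAX_OUTPUT_VERTICES][0];
   return n ? n : 32;
}
```

In this sketch a property explicitly set to 0 cannot be told apart from an unset one, which is why the default of 32 is applied after the lookup instead of being seeded before the scan.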
[Mesa-dev] [PATCH 04/13] tgsi: remove some not so useful variables from tgsi_shader_info
From: Marek Olšák --- src/gallium/auxiliary/tgsi/tgsi_scan.c | 8 src/gallium/auxiliary/tgsi/tgsi_scan.h | 3 --- src/gallium/drivers/llvmpipe/lp_state_fs.c | 4 +++- src/gallium/drivers/softpipe/sp_quad_blend.c | 5 ++--- src/gallium/drivers/softpipe/sp_setup.c | 12 src/gallium/drivers/svga/svga_state_fs.c | 2 +- 6 files changed, 14 insertions(+), 20 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c b/src/gallium/auxiliary/tgsi/tgsi_scan.c index f9d1896..d68dca8 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c @@ -293,14 +293,6 @@ tgsi_scan_shader(const struct tgsi_token *tokens, info->uses_kill = (info->opcode_count[TGSI_OPCODE_KILL_IF] || info->opcode_count[TGSI_OPCODE_KILL]); - /* extract simple properties */ - info->origin_lower_left = - info->properties[TGSI_PROPERTY_FS_COORD_ORIGIN][0]; - info->pixel_center_integer = - info->properties[TGSI_PROPERTY_FS_COORD_PIXEL_CENTER][0]; - info->color0_writes_all_cbufs = - info->properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS][0]; - /* The dimensions of the IN decleration in geometry shader have * to be deduced from the type of the input primitive. 
*/ diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h b/src/gallium/auxiliary/tgsi/tgsi_scan.h index 0d79e29..934acec 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_scan.h +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h @@ -76,9 +76,6 @@ struct tgsi_shader_info boolean uses_vertexid; boolean uses_primid; boolean uses_frontface; - boolean origin_lower_left; - boolean pixel_center_integer; - boolean color0_writes_all_cbufs; boolean writes_viewport_index; boolean writes_layer; boolean is_msaa_sampler[PIPE_MAX_SAMPLERS]; diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c b/src/gallium/drivers/llvmpipe/lp_state_fs.c index 349d85a..cc75266 100644 --- a/src/gallium/drivers/llvmpipe/lp_state_fs.c +++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c @@ -2323,6 +2323,8 @@ generate_fragment(struct llvmpipe_context *lp, LLVMValueRef mask_store = lp_build_array_alloca(gallivm, mask_type, num_loop, "mask_store"); LLVMValueRef color_store[PIPE_MAX_COLOR_BUFS][TGSI_NUM_CHANNELS]; + boolean pixel_center_integer = + shader->info.base.properties[TGSI_PROPERTY_FS_COORD_PIXEL_CENTER][0]; /* * The shader input interpolation info is not explicitely baked in the @@ -2333,7 +2335,7 @@ generate_fragment(struct llvmpipe_context *lp, gallivm, shader->info.base.num_inputs, inputs, - shader->info.base.pixel_center_integer, + pixel_center_integer, builder, fs_type, a0_ptr, dadx_ptr, dady_ptr, x, y); diff --git a/src/gallium/drivers/softpipe/sp_quad_blend.c b/src/gallium/drivers/softpipe/sp_quad_blend.c index 6c52c90..d60e508 100644 --- a/src/gallium/drivers/softpipe/sp_quad_blend.c +++ b/src/gallium/drivers/softpipe/sp_quad_blend.c @@ -923,9 +923,8 @@ blend_fallback(struct quad_stage *qs, struct softpipe_context *softpipe = qs->softpipe; const struct pipe_blend_state *blend = softpipe->blend; unsigned cbuf; - boolean write_all; - - write_all = softpipe->fs_variant->info.color0_writes_all_cbufs; + boolean write_all = + 
softpipe->fs_variant->info.properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS][0]; for (cbuf = 0; cbuf < softpipe->framebuffer.nr_cbufs; cbuf++) { if (softpipe->framebuffer.cbufs[cbuf]) { diff --git a/src/gallium/drivers/softpipe/sp_setup.c b/src/gallium/drivers/softpipe/sp_setup.c index 7937e10..989ed9c 100644 --- a/src/gallium/drivers/softpipe/sp_setup.c +++ b/src/gallium/drivers/softpipe/sp_setup.c @@ -562,17 +562,21 @@ static void setup_fragcoord_coeff(struct setup_context *setup, uint slot) { const struct tgsi_shader_info *fsInfo = &setup->softpipe->fs_variant->info; + boolean origin_lower_left = + fsInfo->properties[TGSI_PROPERTY_FS_COORD_ORIGIN][0]; + boolean pixel_center_integer = + fsInfo->properties[TGSI_PROPERTY_FS_COORD_PIXEL_CENTER][0]; /*X*/ - setup->coef[slot].a0[0] = fsInfo->pixel_center_integer ? 0.0f : 0.5f; + setup->coef[slot].a0[0] = pixel_center_integer ? 0.0f : 0.5f; setup->coef[slot].dadx[0] = 1.0f; setup->coef[slot].dady[0] = 0.0f; /*Y*/ setup->coef[slot].a0[1] = - (fsInfo->origin_lower_left ? setup->softpipe->framebuffer.height-1 : 0) - + (fsInfo->pixel_center_integer ? 0.0f : 0.5f); + (origin_lower_left ? setup->softpipe->framebuffer.height-1 : 0) + + (pixel_center_integer ? 0.0f :
[Mesa-dev] [PATCH 01/13] radeonsi: get tgsi_shader_info only once before compilation
From: Marek Olšák --- src/gallium/drivers/radeonsi/si_shader.c | 25 +++-- src/gallium/drivers/radeonsi/si_shader.h | 2 ++ src/gallium/drivers/radeonsi/si_state.c | 10 +++--- 3 files changed, 16 insertions(+), 21 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 9d2cc80..276ba81 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -2805,7 +2805,6 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader) { struct si_shader_selector *sel = shader->selector; struct si_shader_context si_shader_ctx; - struct tgsi_shader_info shader_info; struct lp_build_tgsi_context * bld_base; LLVMModuleRef mod; int r = 0; @@ -2826,13 +2825,11 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader) radeon_llvm_context_init(&si_shader_ctx.radeon_bld); bld_base = &si_shader_ctx.radeon_bld.soa.bld_base; - tgsi_scan_shader(sel->tokens, &shader_info); - - if (shader_info.uses_kill) + if (sel->info.uses_kill) shader->db_shader_control |= S_02880C_KILL_ENABLE(1); - shader->uses_instanceid = shader_info.uses_instanceid; - bld_base->info = &shader_info; + shader->uses_instanceid = sel->info.uses_instanceid; + bld_base->info = &sel->info; bld_base->emit_fetch_funcs[TGSI_FILE_CONSTANT] = fetch_constant; bld_base->op_actions[TGSI_OPCODE_TEX] = tex_action; @@ -2876,16 +2873,16 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader) bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_gs; bld_base->emit_epilogue = si_llvm_emit_gs_epilogue; - for (i = 0; i < shader_info.num_properties; i++) { - switch (shader_info.properties[i].name) { + for (i = 0; i < sel->info.num_properties; i++) { + switch (sel->info.properties[i].name) { case TGSI_PROPERTY_GS_INPUT_PRIM: - shader->gs_input_prim = shader_info.properties[i].data[0]; + shader->gs_input_prim = sel->info.properties[i].data[0]; break; case TGSI_PROPERTY_GS_OUTPUT_PRIM: - 
shader->gs_output_prim = shader_info.properties[i].data[0]; + shader->gs_output_prim = sel->info.properties[i].data[0]; break; case TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES: - shader->gs_max_out_vertices = shader_info.properties[i].data[0]; + shader->gs_max_out_vertices = sel->info.properties[i].data[0]; break; } } @@ -2897,10 +2894,10 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader) si_shader_ctx.radeon_bld.load_input = declare_input_fs; bld_base->emit_epilogue = si_llvm_emit_fs_epilogue; - for (i = 0; i < shader_info.num_properties; i++) { - switch (shader_info.properties[i].name) { + for (i = 0; i < sel->info.num_properties; i++) { + switch (sel->info.properties[i].name) { case TGSI_PROPERTY_FS_DEPTH_LAYOUT: - switch (shader_info.properties[i].data[0]) { + switch (sel->info.properties[i].data[0]) { case TGSI_FS_DEPTH_LAYOUT_GREATER: shader->db_shader_control |= S_02880C_CONSERVATIVE_Z_EXPORT(V_02880C_EXPORT_GREATER_THAN_Z); diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h index d8a63df..8f5b431 100644 --- a/src/gallium/drivers/radeonsi/si_shader.h +++ b/src/gallium/drivers/radeonsi/si_shader.h @@ -30,6 +30,7 @@ #define SI_SHADER_H #include /* LLVMModuleRef */ +#include "tgsi/tgsi_scan.h" #define SI_SGPR_CONST 0 #define SI_SGPR_SAMPLER2 @@ -117,6 +118,7 @@ struct si_shader_selector { struct tgsi_token *tokens; struct pipe_stream_output_info so; + struct tgsi_shader_info info; unsignednum_shaders; diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index ed90f13..0e2d6c4 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -30,7 +30,6 @@ #include "radeon/r600_cs.h" #include "tgsi/tgsi_parse.h" -#include "tgsi/tgsi_scan.h" #include "util/u_format.h" #include "util/u_format_s3tc.h" #include "util/u_framebuffer.h"
[Mesa-dev] [PATCH 07/13] radeonsi: move geometry shader properties from si_shader to si_shader_selector
From: Marek Olšák --- src/gallium/drivers/radeonsi/si_shader.c | 24 ++-- src/gallium/drivers/radeonsi/si_shader.h | 10 +- src/gallium/drivers/radeonsi/si_state.c | 25 +++-- src/gallium/drivers/radeonsi/si_state_draw.c | 8 4 files changed, 38 insertions(+), 29 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index c5f13be..6372ccf 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -109,7 +109,7 @@ static struct si_shader_context * si_shader_context( * less than 64, so that a 64-bit bitmask of used inputs or outputs can be * calculated. */ -static unsigned get_unique_index(unsigned semantic_name, unsigned index) +unsigned si_shader_io_get_unique_index(unsigned semantic_name, unsigned index) { switch (semantic_name) { case TGSI_SEMANTIC_POSITION: @@ -160,7 +160,7 @@ static unsigned get_unique_index(unsigned semantic_name, unsigned index) static int get_param_index(unsigned semantic_name, unsigned index, uint64_t mask) { - unsigned unique_index = get_unique_index(semantic_name, index); + unsigned unique_index = si_shader_io_get_unique_index(semantic_name, index); int i, param_index = 0; /* If not present... 
*/ @@ -337,13 +337,6 @@ static void declare_input_gs( struct si_shader *shader = si_shader_ctx->shader; si_store_shader_io_attribs(shader, decl); - - if (decl->Semantic.Name != TGSI_SEMANTIC_PRIMID) { - shader->gs_used_inputs |= - 1llu << get_unique_index(decl->Semantic.Name, -decl->Semantic.Index); - shader->nparam++; - } } static LLVMValueRef fetch_input_gs( @@ -410,7 +403,7 @@ static LLVMValueRef fetch_input_gs( args[1] = vtx_offset; args[2] = lp_build_const_int32(gallivm, (get_param_index(input->name, input->sid, - shader->gs_used_inputs) * 4 + + shader->selector->gs_used_inputs) * 4 + swizzle) * 256); args[3] = uint->zero; args[4] = uint->one; /* OFFEN */ @@ -2304,7 +2297,7 @@ static void si_llvm_emit_vertex( */ can_emit = LLVMBuildICmp(gallivm->builder, LLVMIntULE, gs_next_vertex, lp_build_const_int32(gallivm, - shader->gs_max_out_vertices), ""); + shader->selector->gs_max_out_vertices), ""); kill = lp_build_select(&bld_base->base, can_emit, lp_build_const_float(gallivm, 1.0f), lp_build_const_float(gallivm, -1.0f)); @@ -2319,7 +2312,7 @@ static void si_llvm_emit_vertex( LLVMValueRef out_val = LLVMBuildLoad(gallivm->builder, out_ptr[chan], ""); LLVMValueRef voffset = lp_build_const_int32(gallivm, (i * 4 + chan) * - shader->gs_max_out_vertices); + shader->selector->gs_max_out_vertices); voffset = lp_build_add(uint, voffset, gs_next_vertex); voffset = lp_build_mul_imm(uint, voffset, 4); @@ -2767,7 +2760,7 @@ static int si_generate_gs_copy_shader(struct si_screen *sscreen, for (chan = 0; chan < 4; chan++) { args[2] = lp_build_const_int32(gallivm, (i * 4 + chan) * - gs->gs_max_out_vertices * 16 * 4); + gs->selector->gs_max_out_vertices * 16 * 4); outputs[i].values[chan] = LLVMBuildBitCast(gallivm->builder, @@ -2866,11 +2859,6 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader) si_shader_ctx.radeon_bld.load_input = declare_input_gs; bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_gs; bld_base->emit_epilogue = 
si_llvm_emit_gs_epilogue; - - shader->gs_output_prim = - sel->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM][0]; - shader->gs_max_out_vertices = - sel->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0]; break; case TGSI_PROCESSOR_FRAGMENT: si_shader_ctx.radeon_bld.load_input = declare_input_fs; diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/rade
[Mesa-dev] [PATCH 12/13] radeonsi: pass the GS shader directly to si_generate_gs_copy_shader
From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_shader.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 286014c..4e8f80f 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2701,14 +2701,13 @@ int si_compile_llvm(struct si_screen *sscreen, struct si_shader *shader,
 /* Generate code for the hardware VS shader stage to go with a geometry shader */
 static int si_generate_gs_copy_shader(struct si_screen *sscreen,
 				      struct si_shader_context *si_shader_ctx,
-				      bool dump)
+				      struct si_shader *gs, bool dump)
 {
 	struct gallivm_state *gallivm = &si_shader_ctx->radeon_bld.gallivm;
 	struct lp_build_tgsi_context *bld_base = &si_shader_ctx->radeon_bld.soa.bld_base;
 	struct lp_build_context *base = &bld_base->base;
 	struct lp_build_context *uint = &bld_base->uint_bld;
 	struct si_shader *shader = si_shader_ctx->shader;
-	struct si_shader *gs = si_shader_ctx->shader->selector->current;
 	struct si_shader_output_values *outputs;
 	LLVMValueRef t_list_ptr, t_list;
 	LLVMValueRef args[9];
@@ -2910,7 +2909,8 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
 		shader->gs_copy_shader->selector = shader->selector;
 		shader->gs_copy_shader->key = shader->key;
 		si_shader_ctx.shader = shader->gs_copy_shader;
-		if ((r = si_generate_gs_copy_shader(sscreen, &si_shader_ctx, dump))) {
+		if ((r = si_generate_gs_copy_shader(sscreen, &si_shader_ctx,
+						    shader, dump))) {
 			free(shader->gs_copy_shader);
 			shader->gs_copy_shader = NULL;
 			goto out;
--
1.9.1
[Mesa-dev] [PATCH 10/13] radeonsi: make the vertex shader key smaller
From: Marek Olšák

We only support 16 vertex attribs, not 32.
---
 src/gallium/drivers/radeonsi/si_shader.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index d9a89e3..c0e5cf4 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -31,6 +31,7 @@

 #include <llvm-c/Core.h> /* LLVMModuleRef */
 #include "tgsi/tgsi_scan.h"
+#include "si_state.h"

 #define SI_SGPR_CONST		0
 #define SI_SGPR_SAMPLER		2
@@ -140,7 +141,7 @@ union si_shader_key {
 		unsigned	alpha_to_one:1;
 	} ps;
 	struct {
-		unsigned	instance_divisors[PIPE_MAX_ATTRIBS];
+		unsigned	instance_divisors[SI_NUM_VERTEX_BUFFERS];
 		/* The mask of "get_unique_index" bits, needed for ES,
 		 * it describes how the ES->GS ring buffer is laid out. */
 		uint64_t	gs_used_inputs;
--
1.9.1
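To make the size effect of this change concrete, here is a rough sketch comparing two simplified key layouts. The struct names and the `OLD_NUM_ATTRIBS`/`NEW_NUM_ATTRIBS` constants are hypothetical; the values 32 and 16 are taken from the commit message, and the real `union si_shader_key` has more members than shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OLD_NUM_ATTRIBS 32  /* old bound, per the commit message */
#define NEW_NUM_ATTRIBS 16  /* new bound, per the commit message */

/* Hypothetical, simplified versions of the vertex-shader part of the key. */
struct vs_key_old {
   unsigned instance_divisors[OLD_NUM_ATTRIBS];
   uint64_t gs_used_inputs;
};

struct vs_key_new {
   unsigned instance_divisors[NEW_NUM_ATTRIBS];
   uint64_t gs_used_inputs;
};

/* Bytes saved per key by shrinking the divisor array. */
static size_t
key_bytes_saved(void)
{
   return sizeof(struct vs_key_old) - sizeof(struct vs_key_new);
}
```

With a 4-byte `unsigned`, this works out to 64 bytes per key in the sketch. A smaller key is presumably also cheaper to fill in and compare when looking up a shader variant, though the commit message does not say so.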
[Mesa-dev] [PATCH 11/13] radeonsi: set LLVMByValAttribute for all descriptor arrays
From: Marek Olšák

I hope this is correct.
---
 src/gallium/drivers/radeonsi/si_shader.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 69382bd..286014c 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2391,7 +2391,7 @@ static void create_function(struct si_shader_context *si_shader_ctx)
 	struct gallivm_state *gallivm = bld_base->base.gallivm;
 	struct si_shader *shader = si_shader_ctx->shader;
 	LLVMTypeRef params[SI_NUM_PARAMS], f32, i8, i32, v2i32, v3i32, v16i8, v4i32, v8i32;
-	unsigned i, last_sgpr, num_params;
+	unsigned i, last_array_pointer, last_sgpr, num_params;

 	i8 = LLVMInt8TypeInContext(gallivm->context);
 	i32 = LLVMInt32TypeInContext(gallivm->context);
@@ -2406,10 +2406,12 @@ static void create_function(struct si_shader_context *si_shader_ctx)
 	params[SI_PARAM_RW_BUFFERS] = const_array(v16i8, SI_NUM_RW_BUFFERS);
 	params[SI_PARAM_SAMPLER] = const_array(v4i32, SI_NUM_SAMPLER_STATES);
 	params[SI_PARAM_RESOURCE] = const_array(v8i32, SI_NUM_SAMPLER_VIEWS);
+	last_array_pointer = SI_PARAM_RESOURCE;

 	switch (si_shader_ctx->type) {
 	case TGSI_PROCESSOR_VERTEX:
 		params[SI_PARAM_VERTEX_BUFFER] = const_array(v16i8, SI_NUM_VERTEX_BUFFERS);
+		last_array_pointer = SI_PARAM_VERTEX_BUFFER;
 		params[SI_PARAM_BASE_VERTEX] = i32;
 		params[SI_PARAM_START_INSTANCE] = i32;
 		num_params = SI_PARAM_START_INSTANCE+1;
@@ -2493,18 +2495,13 @@ static void create_function(struct si_shader_context *si_shader_ctx)

 	for (i = 0; i <= last_sgpr; ++i) {
 		LLVMValueRef P = LLVMGetParam(si_shader_ctx->radeon_bld.main_fn, i);
-		switch (i) {
-		default:
-			LLVMAddAttribute(P, LLVMInRegAttribute);
-			break;
+
 		/* We tell llvm that array inputs are passed by value to allow Sinking pass
 		 * to move load. Inputs are constant so this is fine. */
-		case SI_PARAM_CONST:
-		case SI_PARAM_SAMPLER:
-		case SI_PARAM_RESOURCE:
+		if (i <= last_array_pointer)
 			LLVMAddAttribute(P, LLVMByValAttribute);
-			break;
-		}
+		else
+			LLVMAddAttribute(P, LLVMInRegAttribute);
 	}

 	if (bld_base->info &&
--
1.9.1
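The rule this patch introduces, one index comparison instead of an explicit list of cases, can be shown without LLVM. The sketch below assumes only what the diff shows, namely that the array pointers occupy the lowest parameter indices; `enum param_attr` and `sgpr_param_attr` are hypothetical names, not part of the driver or of the LLVM C API.

```c
#include <assert.h>

/* Hypothetical tags standing in for the two LLVM parameter attributes. */
enum param_attr { ATTR_BYVAL, ATTR_INREG };

/* Every SGPR parameter up to and including the last array pointer is
 * marked by-value; every later SGPR parameter is marked in-register. */
static enum param_attr
sgpr_param_attr(unsigned index, unsigned last_array_pointer)
{
   return index <= last_array_pointer ? ATTR_BYVAL : ATTR_INREG;
}
```

The comparison only gives the intended result if no array pointer has an index above `last_array_pointer`, so the numbering of the `SI_PARAM_*` indices matters for this scheme.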
[Mesa-dev] [PATCH 06/13] radeonsi: always compile shaders on demand
From: Marek Olšák

The first compiled shader is sometimes useless, because the key doesn't
match the key for the draw call where it's used.
---
 src/gallium/drivers/radeonsi/si_state.c | 16 +++-
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index eb25606..da5fcb0 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2305,19 +2305,12 @@ static void *si_create_shader_state(struct pipe_context *ctx,
 				    unsigned pipe_shader_type)
 {
 	struct si_shader_selector *sel = CALLOC_STRUCT(si_shader_selector);
-	int r;

 	sel->type = pipe_shader_type;
 	sel->tokens = tgsi_dup_tokens(state->tokens);
 	sel->so = state->stream_output;
 	tgsi_scan_shader(state->tokens, &sel->info);

-	r = si_shader_select(ctx, sel);
-	if (r) {
-		free(sel);
-		return NULL;
-	}
-
 	return sel;
 }

@@ -2344,10 +2337,7 @@ static void si_bind_vs_shader(struct pipe_context *ctx, void *state)
 	struct si_context *sctx = (struct si_context *)ctx;
 	struct si_shader_selector *sel = state;

-	if (sctx->vs_shader == sel)
-		return;
-
-	if (!sel || !sel->current)
+	if (sctx->vs_shader == sel || !sel)
 		return;

 	sctx->vs_shader = sel;
@@ -2373,8 +2363,8 @@ static void si_bind_ps_shader(struct pipe_context *ctx, void *state)
 	if (sctx->ps_shader == sel)
 		return;

-	/* use dummy shader if supplied shader is corrupt */
-	if (!sel || !sel->current) {
+	/* use a dummy shader if binding a NULL shader */
+	if (!sel) {
 		if (!sctx->dummy_pixel_shader) {
 			sctx->dummy_pixel_shader =
 				util_make_fragment_cloneinput_shader(&sctx->b.b, 0,
--
1.9.1
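The reasoning in the commit message can be illustrated with a small variant cache: a shader is compiled only when a draw call presents a key with no matching variant yet, so no work is spent on a key that never gets used. This is a sketch under assumptions; `struct selector`, `struct variant`, `select_variant` and the plain integer key are hypothetical and far simpler than the driver's `si_shader_selector` and `union si_shader_key`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical compiled variant, identified by the state key it was built for. */
struct variant {
   unsigned key;
   struct variant *next;
};

/* Hypothetical selector: creating it compiles nothing. */
struct selector {
   struct variant *variants;
   unsigned num_compiles;   /* counts how often the "compiler" ran */
};

/* Return the variant matching `key`, compiling it on first use.
 * Returns NULL if allocation fails. */
static struct variant *
select_variant(struct selector *sel, unsigned key)
{
   struct variant *v;

   for (v = sel->variants; v; v = v->next) {
      if (v->key == key)
         return v;
   }

   v = calloc(1, sizeof(*v));
   if (!v)
      return NULL;

   v->key = key;            /* stand-in for the real compilation step */
   v->next = sel->variants;
   sel->variants = v;
   sel->num_compiles++;
   return v;
}
```

In this model, creating the selector costs nothing, and the first draw with a given key pays for exactly one compile; later draws with the same key reuse the cached variant.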
[Mesa-dev] [PATCH 08/13] radeonsi: remove interp_at_sample from the key, use TGSI_INTERPOLATE_LOC_SAMPLE
From: Marek Olšák

st/mesa has the same flag in its shader key, we don't need to do it
in the driver anymore. Instead, use TGSI_INTERPOLATE_LOC_SAMPLE, which is
what st/mesa sets.
---
 src/gallium/drivers/radeonsi/si_shader.c | 4 ++--
 src/gallium/drivers/radeonsi/si_shader.h | 1 -
 src/gallium/drivers/radeonsi/si_state.c  | 2 --
 3 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 6372ccf..69382bd 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -501,7 +501,7 @@ static void declare_input_fs(
 		interp_param = 0;
 		break;
 	case TGSI_INTERPOLATE_LINEAR:
-		if (si_shader_ctx->shader->key.ps.interp_at_sample)
+		if (decl->Interp.Location == TGSI_INTERPOLATE_LOC_SAMPLE)
 			interp_param = LLVMGetParam(main_fn, SI_PARAM_LINEAR_SAMPLE);
 		else if (decl->Interp.Location == TGSI_INTERPOLATE_LOC_CENTROID)
 			interp_param = LLVMGetParam(main_fn, SI_PARAM_LINEAR_CENTROID);
@@ -515,7 +515,7 @@ static void declare_input_fs(
 		}
 		/* fall through to perspective */
 	case TGSI_INTERPOLATE_PERSPECTIVE:
-		if (si_shader_ctx->shader->key.ps.interp_at_sample)
+		if (decl->Interp.Location == TGSI_INTERPOLATE_LOC_SAMPLE)
 			interp_param = LLVMGetParam(main_fn, SI_PARAM_PERSP_SAMPLE);
 		else if (decl->Interp.Location == TGSI_INTERPOLATE_LOC_CENTROID)
 			interp_param = LLVMGetParam(main_fn, SI_PARAM_PERSP_CENTROID);
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index c46e649..d9a89e3 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -137,7 +137,6 @@ union si_shader_key {
 		unsigned	color_two_side:1;
 		unsigned	alpha_func:3;
 		unsigned	flatshade:1;
-		unsigned	interp_at_sample:1;
 		unsigned	alpha_to_one:1;
 	} ps;
 	struct {
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 46dbca3..88a50f3 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2221,8 +2221,6 @@ static INLINE void si_shader_selector_key(struct pipe_context *ctx,
 		if (sctx->queued.named.rasterizer) {
 			key->ps.color_two_side = sctx->queued.named.rasterizer->two_side;
 			key->ps.flatshade = sctx->queued.named.rasterizer->flatshade;
-			key->ps.interp_at_sample = sctx->framebuffer.nr_samples > 1 &&
-				sctx->ps_iter_samples == sctx->framebuffer.nr_samples;

 			if (sctx->queued.named.blend) {
 				key->ps.alpha_to_one = sctx->queued.named.blend->alpha_to_one &&
--
1.9.1
[Mesa-dev] [PATCH 13/13] radeonsi: set number of userdata SGPRs of GS copy shader to 4
From: Marek Olšák

It only needs the constant buffer with clip planes and read-write resources
for the GS->VS ring and streamout. That's 2 pointers.
---
 src/gallium/drivers/radeonsi/si_shader.c     | 9 -
 src/gallium/drivers/radeonsi/si_shader.h     | 18 ++
 src/gallium/drivers/radeonsi/si_state_draw.c | 6 +-
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 4e8f80f..8680824 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2402,8 +2402,8 @@ static void create_function(struct si_shader_context *si_shader_ctx)
 	v8i32 = LLVMVectorType(i32, 8);
 	v16i8 = LLVMVectorType(i8, 16);

-	params[SI_PARAM_CONST] = const_array(v16i8, SI_NUM_CONST_BUFFERS);
 	params[SI_PARAM_RW_BUFFERS] = const_array(v16i8, SI_NUM_RW_BUFFERS);
+	params[SI_PARAM_CONST] = const_array(v16i8, SI_NUM_CONST_BUFFERS);
 	params[SI_PARAM_SAMPLER] = const_array(v4i32, SI_NUM_SAMPLER_STATES);
 	params[SI_PARAM_RESOURCE] = const_array(v8i32, SI_NUM_SAMPLER_VIEWS);
 	last_array_pointer = SI_PARAM_RESOURCE;
@@ -2415,10 +2415,16 @@ static void create_function(struct si_shader_context *si_shader_ctx)
 		params[SI_PARAM_BASE_VERTEX] = i32;
 		params[SI_PARAM_START_INSTANCE] = i32;
 		num_params = SI_PARAM_START_INSTANCE+1;
+
 		if (shader->key.vs.as_es) {
 			params[SI_PARAM_ES2GS_OFFSET] = i32;
 			num_params++;
 		} else {
+			if (shader->is_gs_copy_shader) {
+				last_array_pointer = SI_PARAM_CONST;
+				num_params = SI_PARAM_CONST+1;
+			}
+
 			/* The locations of the other parameters are assigned dynamically. */

 			/* Streamout SGPRs. */
@@ -2716,6 +2722,7 @@ static int si_generate_gs_copy_shader(struct si_screen *sscreen,
 	outputs = MALLOC(gs->noutput * sizeof(outputs[0]));

 	si_shader_ctx->type = TGSI_PROCESSOR_VERTEX;
+	shader->is_gs_copy_shader = true;

 	radeon_llvm_context_init(&si_shader_ctx->radeon_bld);

diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index c0e5cf4..11e5ae0 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -33,10 +33,10 @@
 #include "tgsi/tgsi_scan.h"
 #include "si_state.h"

-#define SI_SGPR_CONST		0
-#define SI_SGPR_SAMPLER		2
-#define SI_SGPR_RESOURCE	4
-#define SI_SGPR_RW_BUFFERS	6  /* rings (& stream-out, VS only) */
+#define SI_SGPR_RW_BUFFERS	0  /* rings (& stream-out, VS only) */
+#define SI_SGPR_CONST		2
+#define SI_SGPR_SAMPLER		4
+#define SI_SGPR_RESOURCE	6
 #define SI_SGPR_VERTEX_BUFFER	8  /* VS only */
 #define SI_SGPR_BASE_VERTEX	10 /* VS only */
 #define SI_SGPR_START_INSTANCE	11 /* VS only */
@@ -44,13 +44,14 @@

 #define SI_VS_NUM_USER_SGPR	12
 #define SI_GS_NUM_USER_SGPR	8
+#define SI_GSCOPY_NUM_USER_SGPR	4
 #define SI_PS_NUM_USER_SGPR	9

 /* LLVM function parameter indices */
-#define SI_PARAM_CONST		0
-#define SI_PARAM_SAMPLER	1
-#define SI_PARAM_RESOURCE	2
-#define SI_PARAM_RW_BUFFERS	3
+#define SI_PARAM_RW_BUFFERS	0
+#define SI_PARAM_CONST		1
+#define SI_PARAM_SAMPLER	2
+#define SI_PARAM_RESOURCE	3

 /* VS only parameters */
 #define SI_PARAM_VERTEX_BUFFER	4
@@ -183,6 +184,7 @@ struct si_shader {
 	bool		vs_out_layer;
 	unsigned	nr_pos_exports;
 	unsigned	clip_dist_write;
+	bool		is_gs_copy_shader;
 };

 static inline struct si_shader* si_get_vs_state(struct si_context *sctx)
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 6ad2df0..e8d84a9 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -166,7 +166,11 @@ static void si_shader_vs(struct pipe_context *ctx, struct si_shader *shader)

 	vgpr_comp_cnt = shader->uses_instanceid ? 3 : 0;

-	num_user_sgprs = SI_VS_NUM_USER_SGPR;
+	if (shader->is_gs_copy_shader)
+		num_user_sgprs = SI_GSCOPY_NUM_USER_SGPR;
+	else
+		num_user_sgprs = SI_VS_NUM_USER_SGPR;
+
 	num_sgprs = shader->num_sgprs;
 	if (num_user_sgprs > num_sgprs) {
 		/* Last 2 reserved SGPRs are used for VCC */
--
1.9.1
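The figure of 4 user SGPRs follows from the register layout in the patch: the `SI_SGPR_*` offsets for the descriptor pointers advance in steps of two (0, 2, 4, 6, 8), so two pointers need four registers. A small sketch of that arithmetic, where `SGPRS_PER_POINTER` and `num_user_sgprs` are hypothetical names introduced only for illustration:

```c
#include <assert.h>

/* Assumption for this sketch: each descriptor-array pointer occupies two
 * consecutive 32-bit SGPRs, matching the spacing of the SI_SGPR_* offsets
 * in the patch (0, 2, 4, 6, 8). */
#define SGPRS_PER_POINTER 2

/* Hypothetical helper: user SGPRs needed for `num_pointers` pointers
 * followed by `num_dwords` plain 32-bit values. */
static unsigned
num_user_sgprs(unsigned num_pointers, unsigned num_dwords)
{
   return num_pointers * SGPRS_PER_POINTER + num_dwords;
}
```

Under the same assumption, five pointers plus the base-vertex and start-instance values give 12, which matches `SI_VS_NUM_USER_SGPR`, and four pointers give 8, which matches `SI_GS_NUM_USER_SGPR`.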
[Mesa-dev] [PATCH 05/13] radeonsi: remove unused variable si_shader::gs_input_prim
From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_shader.c | 2 --
 src/gallium/drivers/radeonsi/si_shader.h | 1 -
 2 files changed, 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index e76b969..c5f13be 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2867,8 +2867,6 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
 		bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_gs;
 		bld_base->emit_epilogue = si_llvm_emit_gs_epilogue;

-		shader->gs_input_prim =
-			sel->info.properties[TGSI_PROPERTY_GS_INPUT_PRIM][0];
 		shader->gs_output_prim =
 			sel->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM][0];
 		shader->gs_max_out_vertices =
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index c6026bd..827f79e 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -172,7 +172,6 @@ struct si_shader {
 	struct si_shader_output	output[40];

 	/* geometry shader properties */
-	unsigned		gs_input_prim;
 	unsigned		gs_output_prim;
 	unsigned		gs_max_out_vertices;
 	uint64_t		gs_used_inputs; /* mask of "get_unique_index" bits */
--
1.9.1
[Mesa-dev] [PATCH 09/13] radeonsi: don't flush shader caches when building PM4 shader states
From: Marek Olšák

This is a wrong place to flush caches to say the least. I don't think we
need to flush the instruction caches if we don't patch shaders with DMA.
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 8
 1 file changed, 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 2881199..6ad2df0 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -75,8 +75,6 @@ static void si_shader_es(struct pipe_context *ctx, struct si_shader *shader)
 		       S_00B328_VGPR_COMP_CNT(vgpr_comp_cnt));
 	si_pm4_set_reg(pm4, R_00B32C_SPI_SHADER_PGM_RSRC2_ES,
 		       S_00B32C_USER_SGPR(num_user_sgprs));
-
-	sctx->b.flags |= R600_CONTEXT_INV_SHADER_CACHE;
 }

 static void si_shader_gs(struct pipe_context *ctx, struct si_shader *shader)
@@ -147,8 +145,6 @@ static void si_shader_gs(struct pipe_context *ctx, struct si_shader *shader)
 		       S_00B228_SGPRS((num_sgprs - 1) / 8));
 	si_pm4_set_reg(pm4, R_00B22C_SPI_SHADER_PGM_RSRC2_GS,
 		       S_00B22C_USER_SGPR(num_user_sgprs));
-
-	sctx->b.flags |= R600_CONTEXT_INV_SHADER_CACHE;
 }

 static void si_shader_vs(struct pipe_context *ctx, struct si_shader *shader)
@@ -223,8 +219,6 @@ static void si_shader_vs(struct pipe_context *ctx, struct si_shader *shader)
 		       S_00B12C_SO_BASE2_EN(!!shader->selector->so.stride[2]) |
 		       S_00B12C_SO_BASE3_EN(!!shader->selector->so.stride[3]) |
 		       S_00B12C_SO_EN(!!shader->selector->so.num_outputs));
-
-	sctx->b.flags |= R600_CONTEXT_INV_SHADER_CACHE;
 }

 static void si_shader_ps(struct pipe_context *ctx, struct si_shader *shader)
@@ -305,8 +299,6 @@ static void si_shader_ps(struct pipe_context *ctx, struct si_shader *shader)
 	si_pm4_set_reg(pm4, R_00B02C_SPI_SHADER_PGM_RSRC2_PS,
 		       S_00B02C_EXTRA_LDS_SIZE(shader->lds_size) |
 		       S_00B02C_USER_SGPR(num_user_sgprs));
-
-	sctx->b.flags |= R600_CONTEXT_INV_SHADER_CACHE;
 }

 /*
--
1.9.1
[Mesa-dev] [Bug 81680] [r600g] Firefox crashes with hardware acceleration turned on
https://bugs.freedesktop.org/show_bug.cgi?id=81680

--- Comment #41 from Ernst Sjöstrand ---
No longer crashes after applying the patch here!
Re: [Mesa-dev] [PATCH] mesa: relax draw api validation on ES2
On 09/30/2014 06:13 PM, Ian Romanick wrote:
> On 09/30/2014 12:28 AM, Tapani Pälli wrote:
>> Patch fixes failing test in WebGL conformance test
>> 'point-no-attributes' when running Chrome on OpenGL ES.
>> (Shader program may draw points using constant data in shader.)
>>
>> No Piglit regressions.
>
> This sounds believable. Did you also try the ES2 or ES3 conformance
> suite? I could have sworn that we had a bug related to this a long time
> ago, and we discovered it using the conformance suite.

I did not check the non-web conformance suite, but I can give it a try.

> Either way, we should get a piglit test too... I think we have a test
> for desktop OpenGL (maybe 3.1?), so it shouldn't be too hard to adapt that.

OK, I will make tests (for the existing fixes; there is still a bunch of other failures left to fix). So far I've just used the online conformance tests to test these changes and Piglit to catch possible regressions.

>> Signed-off-by: Tapani Pälli
>> ---
>> src/mesa/main/api_validate.c | 5 ++---
>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
>> index 51a3d1f..9b80600 100644
>> --- a/src/mesa/main/api_validate.c
>> +++ b/src/mesa/main/api_validate.c
>> @@ -112,9 +112,8 @@ check_valid_to_render(struct gl_context *ctx, const char *function)
>>
>>    switch (ctx->API) {
>>    case API_OPENGLES2:
>> -      /* For ES2, we can draw if any vertex array is enabled (and we
>> -       * should always have a vertex program/shader). */
>> -      if (ctx->Array.VAO->_Enabled == 0x0 || !ctx->VertexProgram._Current)
>> +      /* For ES2, we can draw if we have a vertex program/shader). */
>> +      if (!ctx->VertexProgram._Current)
>>          return GL_FALSE;
>>       break;
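Editorial sketch of the behavioural change discussed in this thread, not Mesa code: the struct and both function names below are hypothetical stand-ins for the gl_context fields tested in check_valid_to_render(). It only illustrates that the old ES2 check rejected a draw with no enabled vertex arrays, while the relaxed check looks at the vertex program alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduction of the state the ES2 check looks at. */
struct draw_state {
   uint64_t enabled_arrays;      /* bitmask of enabled vertex arrays */
   bool     has_vertex_program;  /* a current vertex program/shader exists */
};

/* Old rule: need at least one enabled array AND a vertex program. */
static bool
valid_to_render_old(const struct draw_state *s)
{
   return s->enabled_arrays != 0 && s->has_vertex_program;
}

/* Relaxed rule: a vertex program is enough; it may use constant data. */
static bool
valid_to_render_new(const struct draw_state *s)
{
   return s->has_vertex_program;
}
```

With no arrays enabled and a program bound, the old rule refuses the draw and the new one accepts it, which is the 'point-no-attributes' case.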
Re: [Mesa-dev] [PATCH] gallium/util: add util_bitcount64
On Tue, Sep 30, 2014 at 12:29:52PM -0400, Ilia Mirkin wrote: > Perhaps do the same thing as util_bitcount, i.e. > > #if defined(PIPE_CC_GCC) && (PIPE_CC_GCC_VERSION >= 304) > return __builtin_popcountll(n); > #else > ... > #endif > > Perhaps the gcc version check is no longer necessary, unlikely > anyone's using gcc3.3 or earlier at this point. But whatever. > I saw a patch from Matt recently that added autoconf checks for a bunch of different builtin functions, I think we should use those instead. -Tom > On Tue, Sep 30, 2014 at 12:26 PM, Marek Olšák wrote: > > From: Marek Olšák > > > > I'll need this in radeonsi. > > --- > > src/gallium/auxiliary/util/u_math.h | 8 > > 1 file changed, 8 insertions(+) > > > > diff --git a/src/gallium/auxiliary/util/u_math.h > > b/src/gallium/auxiliary/util/u_math.h > > index 39bd40f..48d5c31 100644 > > --- a/src/gallium/auxiliary/util/u_math.h > > +++ b/src/gallium/auxiliary/util/u_math.h > > @@ -727,6 +727,14 @@ util_bitcount(unsigned n) > > #endif > > } > > > > + > > +static INLINE unsigned > > +util_bitcount64(uint64_t n) > > +{ > > + return util_bitcount(n) + util_bitcount(n >> 32); > > +} > > + > > + > > /** > > * Reverse bits in n > > * Algorithm taken from: > > -- > > 1.9.1 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium/util: add util_bitcount64
On Tue, Sep 30, 2014 at 1:14 PM, Tom Stellard wrote: > On Tue, Sep 30, 2014 at 12:29:52PM -0400, Ilia Mirkin wrote: >> Perhaps do the same thing as util_bitcount, i.e. >> >> #if defined(PIPE_CC_GCC) && (PIPE_CC_GCC_VERSION >= 304) >> return __builtin_popcountll(n); >> #else >> ... >> #endif >> >> Perhaps the gcc version check is no longer necessary, unlikely >> anyone's using gcc3.3 or earlier at this point. But whatever. >> > > I saw a patch from Matt recently that added autoconf checks for a bunch > of different builtin functions, I think we should use those instead. That sounds way better, but these version checks are all over u_math.h -- feels like a separate cleanup, not necessary to saddle this simple change with working out how autoconf works :) But if Marek wants to do it, I won't object... > > -Tom > >> On Tue, Sep 30, 2014 at 12:26 PM, Marek Olšák wrote: >> > From: Marek Olšák >> > >> > I'll need this in radeonsi. >> > --- >> > src/gallium/auxiliary/util/u_math.h | 8 >> > 1 file changed, 8 insertions(+) >> > >> > diff --git a/src/gallium/auxiliary/util/u_math.h >> > b/src/gallium/auxiliary/util/u_math.h >> > index 39bd40f..48d5c31 100644 >> > --- a/src/gallium/auxiliary/util/u_math.h >> > +++ b/src/gallium/auxiliary/util/u_math.h >> > @@ -727,6 +727,14 @@ util_bitcount(unsigned n) >> > #endif >> > } >> > >> > + >> > +static INLINE unsigned >> > +util_bitcount64(uint64_t n) >> > +{ >> > + return util_bitcount(n) + util_bitcount(n >> 32); >> > +} >> > + >> > + >> > /** >> > * Reverse bits in n >> > * Algorithm taken from: >> > -- >> > 1.9.1 >> > >> > ___ >> > mesa-dev mailing list >> > mesa-dev@lists.freedesktop.org >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] tgsi: fix Semantic.Name assignment in tgsi_transform_input_decl()
Reviewed-by: Charmaine Lee

From: mesa-dev on behalf of Brian Paul
Sent: Tuesday, September 30, 2014 9:31 AM
To: mesa-dev@lists.freedesktop.org
Subject: [Mesa-dev] [PATCH] tgsi: fix Semantic.Name assignment in tgsi_transform_input_decl()

Assign the sem_name parameter, not TGSI_SEMANTIC_GENERIC. Fixes polygon stipple regression.
---
 src/gallium/auxiliary/tgsi/tgsi_transform.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_transform.h b/src/gallium/auxiliary/tgsi/tgsi_transform.h
index bfcdd56..921aa90 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_transform.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_transform.h
@@ -120,7 +120,7 @@ tgsi_transform_input_decl(struct tgsi_transform_context *ctx,
    decl.Declaration.File = TGSI_FILE_INPUT;
    decl.Declaration.Interpolate = 1;
    decl.Declaration.Semantic = 1;
-   decl.Semantic.Name = TGSI_SEMANTIC_GENERIC;
+   decl.Semantic.Name = sem_name;
    decl.Semantic.Index = sem_index;
    decl.Range.First = decl.Range.Last = index;
--
1.7.10.4
Re: [Mesa-dev] [PATCH] llvmpipe: move lp_jit_screen_init() call after allocation of screen object
On 30.09.2014 15:16, Brian Paul wrote:
> The screen argument isn't actually used by lp_jit_screen_init() at this
> time, but let's move the call so that we pass a valid pointer.
>
> v2: don't leak screen if lp_jit_screen_init() fails.
> ---
> src/gallium/drivers/llvmpipe/lp_screen.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c
> b/src/gallium/drivers/llvmpipe/lp_screen.c
> index 3025322..a264f99 100644
> --- a/src/gallium/drivers/llvmpipe/lp_screen.c
> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c
> @@ -557,9 +557,6 @@ llvmpipe_create_screen(struct sw_winsys *winsys)
>        return NULL;
>  #endif
>
> -   if (!lp_jit_screen_init(screen))
> -      return NULL;
> -
>  #ifdef DEBUG
>     LP_DEBUG = debug_get_flags_option("LP_DEBUG", lp_debug_flags, 0 );
>  #endif
> @@ -570,6 +567,11 @@ llvmpipe_create_screen(struct sw_winsys *winsys)
>     if (!screen)
>        return NULL;
>
> +   if (!lp_jit_screen_init(screen)) {
> +      FREE(screen);
> +      return NULL;
> +   }
> +
>     screen->winsys = winsys;
>
>     screen->base.destroy = llvmpipe_destroy_screen;
>

Reviewed-by: Roland Scheidegger
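Editorial sketch of the ordering the patch establishes (allocate first, initialise second, free on failure). Everything here is hypothetical: struct screen, jit_screen_init() and create_screen() are stand-ins written with plain calloc()/free(), not the llvmpipe functions or Mesa's CALLOC_STRUCT/FREE macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's screen object. */
struct screen {
   bool jit_ready;
};

/* Hypothetical stand-in for lp_jit_screen_init(); can be told to fail. */
static bool
jit_screen_init(struct screen *s, bool simulate_failure)
{
   if (simulate_failure)
      return false;
   s->jit_ready = true;
   return true;
}

static struct screen *
create_screen(bool simulate_failure)
{
   struct screen *s = calloc(1, sizeof(*s));
   if (!s)
      return NULL;

   /* Init runs on a valid pointer; on failure the allocation is released
    * so the caller never sees a half-built object and nothing leaks.
    */
   if (!jit_screen_init(s, simulate_failure)) {
      free(s);
      return NULL;
   }
   return s;
}
```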
[Mesa-dev] [PATCH] i965/brw_reg: Make the accumulator register take an explicit width.
The big pile of patches I just pushed regresses about 25 piglit tests on SNB. This fixes the regressions. Signed-off-by: Jason Ekstrand --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 12 src/mesa/drivers/dri/i965/brw_reg.h| 5 +++-- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 8 3 files changed, 15 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 9f65b1f..89ac7e2 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -636,7 +636,8 @@ fs_visitor::visit(ir_expression *ir) if (brw->gen >= 7) no16("SIMD16 explicit accumulator operands unsupported\n"); -struct brw_reg acc = retype(brw_acc_reg(), this->result.type); +struct brw_reg acc = retype(brw_acc_reg(dispatch_width), +this->result.type); emit(MUL(acc, op[0], op[1])); emit(MACH(reg_null_d, op[0], op[1])); @@ -650,7 +651,8 @@ fs_visitor::visit(ir_expression *ir) if (brw->gen >= 7) no16("SIMD16 explicit accumulator operands unsupported\n"); - struct brw_reg acc = retype(brw_acc_reg(), this->result.type); + struct brw_reg acc = retype(brw_acc_reg(dispatch_width), + this->result.type); emit(MUL(acc, op[0], op[1])); emit(MACH(this->result, op[0], op[1])); @@ -665,7 +667,8 @@ fs_visitor::visit(ir_expression *ir) if (brw->gen >= 7) no16("SIMD16 explicit accumulator operands unsupported\n"); - struct brw_reg acc = retype(brw_acc_reg(), BRW_REGISTER_TYPE_UD); + struct brw_reg acc = retype(brw_acc_reg(dispatch_width), + BRW_REGISTER_TYPE_UD); emit(ADDC(reg_null_ud, op[0], op[1])); emit(MOV(this->result, fs_reg(acc))); @@ -675,7 +678,8 @@ fs_visitor::visit(ir_expression *ir) if (brw->gen >= 7) no16("SIMD16 explicit accumulator operands unsupported\n"); - struct brw_reg acc = retype(brw_acc_reg(), BRW_REGISTER_TYPE_UD); + struct brw_reg acc = retype(brw_acc_reg(dispatch_width), + BRW_REGISTER_TYPE_UD); emit(SUBB(reg_null_ud, op[0], op[1])); emit(MOV(this->result, 
fs_reg(acc))); diff --git a/src/mesa/drivers/dri/i965/brw_reg.h b/src/mesa/drivers/dri/i965/brw_reg.h index 2e110d6..19af0ae 100644 --- a/src/mesa/drivers/dri/i965/brw_reg.h +++ b/src/mesa/drivers/dri/i965/brw_reg.h @@ -639,9 +639,10 @@ brw_ip_reg(void) } static inline struct brw_reg -brw_acc_reg(void) +brw_acc_reg(unsigned width) { - return brw_vec8_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_ACCUMULATOR, 0); + return brw_vecn_reg(width, BRW_ARCHITECTURE_REGISTER_FILE, + BRW_ARF_ACCUMULATOR, 0); } static inline struct brw_reg diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index 9299029..f03cf4f 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -1455,7 +1455,7 @@ vec4_visitor::visit(ir_expression *ir) else emit(MUL(result_dst, op[0], op[1])); } else { -struct brw_reg acc = retype(brw_acc_reg(), result_dst.type); +struct brw_reg acc = retype(brw_acc_reg(8), result_dst.type); emit(MUL(acc, op[0], op[1])); emit(MACH(dst_null_d(), op[0], op[1])); @@ -1466,7 +1466,7 @@ vec4_visitor::visit(ir_expression *ir) } break; case ir_binop_imul_high: { - struct brw_reg acc = retype(brw_acc_reg(), result_dst.type); + struct brw_reg acc = retype(brw_acc_reg(8), result_dst.type); emit(MUL(acc, op[0], op[1])); emit(MACH(result_dst, op[0], op[1])); @@ -1478,14 +1478,14 @@ vec4_visitor::visit(ir_expression *ir) emit_math(SHADER_OPCODE_INT_QUOTIENT, result_dst, op[0], op[1]); break; case ir_binop_carry: { - struct brw_reg acc = retype(brw_acc_reg(), BRW_REGISTER_TYPE_UD); + struct brw_reg acc = retype(brw_acc_reg(8), BRW_REGISTER_TYPE_UD); emit(ADDC(dst_null_ud(), op[0], op[1])); emit(MOV(result_dst, src_reg(acc))); break; } case ir_binop_borrow: { - struct brw_reg acc = retype(brw_acc_reg(), BRW_REGISTER_TYPE_UD); + struct brw_reg acc = retype(brw_acc_reg(8), BRW_REGISTER_TYPE_UD); emit(SUBB(dst_null_ud(), op[0], op[1])); emit(MOV(result_dst, src_reg(acc))); 
-- 2.1.0
[Mesa-dev] [PATCH 2/2] galahad: fix indirect draw
From: Roland Scheidegger Need to unwrap the indirect resource otherwise bad things will happen. Fixes random crashes and timeouts with piglit's arb_indirect_draw tests. --- src/gallium/drivers/galahad/glhd_context.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/galahad/glhd_context.c b/src/gallium/drivers/galahad/glhd_context.c index 79d5495..37ea170 100644 --- a/src/gallium/drivers/galahad/glhd_context.c +++ b/src/gallium/drivers/galahad/glhd_context.c @@ -49,7 +49,7 @@ galahad_context_destroy(struct pipe_context *_pipe) static void galahad_context_draw_vbo(struct pipe_context *_pipe, - const struct pipe_draw_info *info) + const struct pipe_draw_info *info) { struct galahad_context *glhd_pipe = galahad_context(_pipe); struct pipe_context *pipe = glhd_pipe->pipe; @@ -58,7 +58,14 @@ galahad_context_draw_vbo(struct pipe_context *_pipe, * before drawing. */ - pipe->draw_vbo(pipe, info); + if (info->indirect) { + struct pipe_draw_info info_unwrapped = *info; + info_unwrapped.indirect = galahad_resource_unwrap(info->indirect); + pipe->draw_vbo(pipe, &info_unwrapped); + } + else { + pipe->draw_vbo(pipe, info); + } } static struct pipe_query * -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
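Editorial sketch of the unwrap-before-forwarding pattern this patch uses. The types and functions are hypothetical reductions, not the galahad or gallium definitions: a wrapper layer copies the const draw info, swaps the wrapped resource for the real one, and forwards the copy, leaving the caller's struct untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper resource: points at the underlying driver object. */
struct resource {
   struct resource *wrapped;
};

/* Hypothetical reduction of a draw-info struct. */
struct draw_info {
   unsigned count;
   struct resource *indirect;   /* NULL for direct draws */
};

/* Records what the "real" driver received, for illustration only. */
static struct resource *last_indirect_seen;

static void
driver_draw(const struct draw_info *info)
{
   last_indirect_seen = info->indirect;
}

static void
wrapper_draw(const struct draw_info *info)
{
   if (info->indirect) {
      /* The caller's struct is const, so patch a local copy instead. */
      struct draw_info unwrapped = *info;
      unwrapped.indirect = info->indirect->wrapped;
      driver_draw(&unwrapped);
   } else {
      driver_draw(info);
   }
}
```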
[Mesa-dev] [PATCH 1/2] galahad: (trivial) handle cubemap arrays
From: Roland Scheidegger --- src/gallium/drivers/galahad/glhd_screen.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/gallium/drivers/galahad/glhd_screen.c b/src/gallium/drivers/galahad/glhd_screen.c index 5a91077..11ab1a9 100644 --- a/src/gallium/drivers/galahad/glhd_screen.c +++ b/src/gallium/drivers/galahad/glhd_screen.c @@ -176,6 +176,13 @@ galahad_screen_resource_create(struct pipe_screen *_screen, glhd_check("%u", templat->height0, == templat->width0); glhd_check("%u", templat->depth0, == 1); glhd_check("%u", templat->array_size, == 6); + } else if (templat->target == PIPE_TEXTURE_CUBE_ARRAY) { + unsigned max_texture_cube_levels = screen->get_param(screen, PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS); + glhd_check("%u", templat->last_level, < max_texture_cube_levels); + glhd_check("%u", templat->width0, <= (1 << (max_texture_cube_levels - 1))); + glhd_check("%u", templat->height0, == templat->width0); + glhd_check("%u", templat->depth0, == 1); + glhd_check("%u", templat->array_size, % 6 == 0); } else if (templat->target == PIPE_TEXTURE_RECT) { unsigned max_texture_2d_levels = screen->get_param(screen, PIPE_CAP_MAX_TEXTURE_2D_LEVELS); glhd_check("%u", templat->last_level, == 0); -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
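Editorial sketch of the cube-map-array checks added above, collapsed into one predicate. struct tex_template and cube_array_template_ok() are hypothetical names; the conditions mirror the glhd_check() lines in the patch, and max_cube_levels is assumed to be between 1 and 32.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduction of the resource template fields being checked. */
struct tex_template {
   unsigned width0, height0, depth0, array_size, last_level;
};

static bool
cube_array_template_ok(const struct tex_template *t, unsigned max_cube_levels)
{
   return t->last_level < max_cube_levels &&
          t->width0 <= (1u << (max_cube_levels - 1)) &&
          t->height0 == t->width0 &&     /* faces are square */
          t->depth0 == 1 &&
          t->array_size % 6 == 0;        /* whole cubes only: 6 faces each */
}
```

The only difference from the plain cube case is the last condition: a cube map requires exactly six layers, a cube map array any multiple of six.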
Re: [Mesa-dev] [PATCH] i965/brw_reg: Make the accumulator register take an explicit width.
Assuming no regressions on other platforms:

Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH V3 1/4] mesa: Add new variables in gl_context to store sample layout
On Mon, Sep 29, 2014 at 7:16 PM, Jordan Justen wrote: > > On 2014-09-29 16:33:33, Anuj Phogat wrote: > > SampleMap{2,4,8}x variables are used in later patches to implement > > EXT_framebuffer_multisample_blit_scaled extension. > > > > V2: Use integer array instead of a string. > > Bump up the comment. > > > > Signed-off-by: Anuj Phogat > > --- > > src/mesa/main/mtypes.h | 32 > > 1 file changed, 32 insertions(+) > > > > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h > > index 0d50be8..162dc44 100644 > > --- a/src/mesa/main/mtypes.h > > +++ b/src/mesa/main/mtypes.h > > @@ -3608,6 +3608,38 @@ struct gl_constants > > GLint MaxDepthTextureSamples; > > GLint MaxIntegerSamples; > > > > + /** > > +* GL_EXT_texture_multisample_blit_scaled implementation assumes that > > +* samples are laid out in a rectangular grid roughly corresponding to > > +* sample locations within a pixel. Below SampleMap{2,4,8}x variables > > +* are used to map indices of rectangular grid to sample numbers within > > +* a pixel. This mapping of indices to sample numbers must be > > initialized > > +* by the driver for the target hardware. For example, if we have the 8X > > +* MSAA sample number layout (sample positions) for XYZ hardware: > > +* > > +*sample indices layout sample number layout > > +*- - > > +*| 0 | 1 | | a | b | > > +*- - > > +*| 2 | 3 | | c | d | > > +*- - > > +*| 4 | 5 | | e | f | > > +*- - > > +*| 6 | 7 | | g | h | > > +*- - > > +* > > +* Where a,b,c,d,e,f,g,h are integers between [0-7]. > > +* > > +* Then, initialize the SampleMap8x variable for XYZ hardware as shown > > +* below: > > +*SampleMap8x = {a, b, c, d, e, f, g, h}; > > +* > > +* Follow the logic for other sample counts. > > +*/ > > + unsigned *SampleMap2x; > > + unsigned *SampleMap4x; > > + unsigned *SampleMap8x; > > Wouldn't uint8_t work given the 0-7 range? > Yes, that'll work. 
> Also, I thought we could include the bytes directly in the structure,
> and let the driver write the bytes rather than setting a pointer to
> the array.
>
>    uint8_t SampleMap2x[2];
>    uint8_t SampleMap4x[4];
>    uint8_t SampleMap8x[8];
>

I'll make the changes and send out a V4.

> -Jordan
>
> > /** GL_ARB_shader_atomic_counters */
> > GLuint MaxAtomicBufferBindings;
> > GLuint MaxAtomicBufferSize;
> > --
> > 1.9.3
Re: [Mesa-dev] [PATCH] gallium/util: add util_bitcount64
On Tue, Sep 30, 2014 at 9:26 AM, Marek Olšák wrote:
> From: Marek Olšák
>
> I'll need this in radeonsi.
> ---
> src/gallium/auxiliary/util/u_math.h | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/src/gallium/auxiliary/util/u_math.h
> b/src/gallium/auxiliary/util/u_math.h
> index 39bd40f..48d5c31 100644
> --- a/src/gallium/auxiliary/util/u_math.h
> +++ b/src/gallium/auxiliary/util/u_math.h
> @@ -727,6 +727,14 @@ util_bitcount(unsigned n)
>  #endif
>  }
>
> +
> +static INLINE unsigned
> +util_bitcount64(uint64_t n)
> +{
> +   return util_bitcount(n) + util_bitcount(n >> 32);

There's a __builtin_popcountll that operates on a 64-bit value directly. You should probably use that instead. We already use it and test for it with autoconf -- just check #ifdef HAVE___BUILTIN_POPCOUNTLL.
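Editorial sketch of what the suggestion in this thread could look like, under stated assumptions: bitcount32() and bitcount64() are hypothetical names, not the u_math.h functions, and HAVE___BUILTIN_POPCOUNTLL is assumed to be defined by the build system when the GCC builtin is available. When the macro is absent, the code falls back to counting the two 32-bit halves, as the original patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Portable fallback: parallel bit count on one 32-bit word. */
static unsigned
bitcount32(uint32_t n)
{
   n = n - ((n >> 1) & 0x55555555u);
   n = (n & 0x33333333u) + ((n >> 2) & 0x33333333u);
   n = (n + (n >> 4)) & 0x0f0f0f0fu;
   return (n * 0x01010101u) >> 24;
}

static unsigned
bitcount64(uint64_t n)
{
#ifdef HAVE___BUILTIN_POPCOUNTLL
   /* Assumed to be set by a configure-time check for the builtin. */
   return (unsigned) __builtin_popcountll(n);
#else
   /* Explicit casts make the split into low and high halves visible. */
   return bitcount32((uint32_t) n) + bitcount32((uint32_t) (n >> 32));
#endif
}
```

Both branches return the same count; the builtin simply lets the compiler emit a single population-count instruction where the target has one.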
[Mesa-dev] [Bug 81680] [r600g] Firefox crashes with hardware acceleration turned on
https://bugs.freedesktop.org/show_bug.cgi?id=81680

--- Comment #42 from Benjamin Bellec ---
(In reply to comment #40)
> Created attachment 107124
> possible fix
>
> Could you please test this patch?

Tested-by: Benjamin Bellec

Your patch fixes the crash. Tested on Evergreen.
Re: [Mesa-dev] [RFC PATCH 05/56] mesa/main: Add tessellation shader state and limits
On Tue, Sep 30, 2014 at 8:50 AM, Ian Romanick wrote:
> On 09/20/2014 07:41 PM, Matt Turner wrote:
>> On Sat, Sep 20, 2014 at 6:40 PM, Chris Forbes wrote:
>>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>>> index 79d2e94..c11ad4f 100644
>>> --- a/src/mesa/main/shaderapi.c
>>> +++ b/src/mesa/main/shaderapi.c
>>> @@ -105,6 +105,7 @@ _mesa_get_shader_flags(void)
>>> void
>>> _mesa_init_shader_state(struct gl_context *ctx)
>>> {
>>> + int i;
>>
>> In context, this declaration looks odd. Move it below the two just
>> after this hunk?
>
> Not in core Mesa where we have to do dumb ol' C89. :(

Move it after the other two variable declarations...

   /* Device drivers may override these to control what kind of instructions
    * are generated by the GLSL compiler.
    */
   struct gl_shader_compiler_options options;
   gl_shader_stage sh;
Re: [Mesa-dev] [PATCH 2/2] i965: Use BDW_MOCS_PTE for renderbuffers.
On Tuesday, September 30, 2014 10:33:42 AM Daniel Vetter wrote: > On Tue, Sep 30, 2014 at 01:15:56AM -0700, Kenneth Graunke wrote: > > Write-back caching cannot be used for buffers being scanned out by the > > display engine; surfaces used for scan-out must be write-through or > > uncached. I originally chose WT for render targets because it works in > > all cases. However, we really want to use write-back caching where > > possible, as it is more efficient. > > > > Most renderbuffers are not used for scanout - off-screen FBOs certainly > > are fine, and non-pageflipped backbuffers should be fine as well. So > > in most cases WB will work. However, we don't know what will be used > > for scan-out, so we instead simply use the PTE value specified by the > > kernel, as it knows these things. > > > > This matches our MOCS choice on Haswell. > > > > Fixes performance regressions since commit ee4484be3dc827cf15bcf109f5 > > in a microbenchmark (spotted by Eero Tamminen). Improves performance > > in GLBenchmark 2.7/EgyptHD by 7.44362% +/- 0.496939% (n=55) on a > > Broadwell GT2. > > > > Signed-off-by: Kenneth Graunke > > Reported-by: Eero Tamminen > > Cc: mesa-sta...@lists.freedesktop.org > > --- > > src/mesa/drivers/dri/i965/gen8_surface_state.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > Cc'd to stable because it's a pretty trivial change and provides a sizable > > boost to performance on new hardware. > > Both patches are Reviewed-by: Daniel Vetter > > Aside: Not using WT on display can lead to corruption (apparently bdw is > fairly aggressive with writeback so hard to spot in reality), so imo > definitely stable material. > > With the hw display crc stuff we now support in the kernel/igt we could > even write an automated testcase for these corruptions, but probably not > worth the hassle. > -Daniel Well, we should have already been using WT when writing anything that hits the display. 
The advantage of using the PTE entries' cache mode is that we should get WB for most (non-displayed) surfaces, but still get WT for anything displayed.

Thanks for the review!

--Ken
Re: [Mesa-dev] SandyBridge's 'resinfo' -> returned value for SURFTYPE_BUFFER?
On Tue, Sep 30, 2014 at 5:22 AM, Samuel Iglesias Gonsálvez wrote:
> Hello,
>
> I am looking at bug 57439 [0] where it shows an error
> in a piglit test [1] related to textureSize() function happening
> in Intel SandyBridge hardware.
>
> According to SNB's PRM documentation (vol4 part1 page 141), the
> returned value for SURFTYPE_BUFFER (the surface type used in the test)
> is not defined in the 'resinfo' message type. For IvyBridge's doc it is
> defined as the buffer size, which is calculated from combined
> Depth/Height/Width values.
>
> As it is not clear that SNB returns the same value than IVB for that
> kind of message and surface type, I send this email here asking for a
> clarification :-)

Yes, I can confirm that the internal BSpec says that on Sandybridge, resinfo for SURFTYPE_BUFFER (and SURFTYPE_STRBUF, same thing?) returns undefined results in all channels.
[Mesa-dev] [PATCH V4 1/4] mesa: Add new variables in gl_context to store sample layout
SampleMap{2,4,8}x variables are used in later patches to implement EXT_framebuffer_multisample_blit_scaled extension. V2: Use integer array instead of a string. Bump up the comment. V3: Use uint8_t type array. Signed-off-by: Anuj Phogat --- src/mesa/main/mtypes.h | 32 1 file changed, 32 insertions(+) diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 0d50be8..258531b 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -3608,6 +3608,38 @@ struct gl_constants GLint MaxDepthTextureSamples; GLint MaxIntegerSamples; + /** +* GL_EXT_texture_multisample_blit_scaled implementation assumes that +* samples are laid out in a rectangular grid roughly corresponding to +* sample locations within a pixel. Below SampleMap{2,4,8}x variables +* are used to map indices of rectangular grid to sample numbers within +* a pixel. This mapping of indices to sample numbers must be initialized +* by the driver for the target hardware. For example, if we have the 8X +* MSAA sample number layout (sample positions) for XYZ hardware: +* +*sample indices layout sample number layout +*- - +*| 0 | 1 | | a | b | +*- - +*| 2 | 3 | | c | d | +*- - +*| 4 | 5 | | e | f | +*- - +*| 6 | 7 | | g | h | +*- - +* +* Where a,b,c,d,e,f,g,h are integers between [0-7]. +* +* Then, initialize the SampleMap8x variable for XYZ hardware as shown +* below: +*SampleMap8x = {a, b, c, d, e, f, g, h}; +* +* Follow the logic for other sample counts. +*/ + uint8_t SampleMap2x[2]; + uint8_t SampleMap4x[4]; + uint8_t SampleMap8x[8]; + /** GL_ARB_shader_atomic_counters */ GLuint MaxAtomicBufferBindings; GLuint MaxAtomicBufferSize; -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V4 2/4] i965: Initialize the SampleMap{2, 4, 8}x variables
with values specific to Intel hardware. V2: Define and use gen6_get_sample_map() function to initialize the variables. V3: Change the function name to gen6_set_sample_maps() and use memcpy() to fill in the data. Signed-off-by: Anuj Phogat --- src/mesa/drivers/dri/i965/brw_context.c| 8 src/mesa/drivers/dri/i965/brw_context.h| 2 + src/mesa/drivers/dri/i965/gen6_multisample_state.c | 45 ++ 3 files changed, 55 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 619f2d5..ebe6a50 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -406,6 +406,14 @@ brw_initialize_context_constants(struct brw_context *brw) ctx->Const.MaxDepthTextureSamples = max_samples; ctx->Const.MaxIntegerSamples = max_samples; + /* gen6_set_sample_maps() sets SampleMap{2,4,8}x variables which are used +* to map indices of rectangular grid to sample numbers within a pixel. +* These variables are used by GL_EXT_framebuffer_multisample_blit_scaled +* extension implementation. For more details see the comment above +* gen6_set_sample_maps() definition. 
+*/ + gen6_set_sample_maps(ctx); + if (brw->gen >= 7) ctx->Const.MaxProgramTextureGatherComponents = 4; else if (brw->gen == 6) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 5830aa99..e0f2e6b 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1660,6 +1660,8 @@ gen6_get_sample_position(struct gl_context *ctx, struct gl_framebuffer *fb, GLuint index, GLfloat *result); +void +gen6_set_sample_maps(struct gl_context *ctx); /* gen8_multisample_state.c */ void gen8_emit_3dstate_multisample(struct brw_context *brw, unsigned num_samp); diff --git a/src/mesa/drivers/dri/i965/gen6_multisample_state.c b/src/mesa/drivers/dri/i965/gen6_multisample_state.c index 429a590..ee20c08 100644 --- a/src/mesa/drivers/dri/i965/gen6_multisample_state.c +++ b/src/mesa/drivers/dri/i965/gen6_multisample_state.c @@ -57,6 +57,51 @@ gen6_get_sample_position(struct gl_context *ctx, } /** + * Sample index layout shows the numbering of slots in a rectangular + * grid of samples with in a pixel. Sample number layout shows the + * rectangular grid of samples roughly corresponding to the real sample + * locations with in a pixel. Sample number layout matches the sample + * index layout in case of 2X and 4x MSAA, but they are different in + * case of 8X MSAA. + * + * 2X MSAA sample index / number layout + * - + * | 0 | 1 | + * - + * + * 4X MSAA sample index / number layout + * - + * | 0 | 1 | + * - + * | 2 | 3 | + * - + * + * 8X MSAA sample index layout8x MSAA sample number layout + * - - + * | 0 | 1 | | 5 | 2 | + * - - + * | 2 | 3 | | 4 | 6 | + * - - + * | 4 | 5 | | 0 | 3 | + * - - + * | 6 | 7 | | 7 | 1 | + * - - + * + * A sample map is used to map sample indices to sample numbers. 
+ */ +void +gen6_set_sample_maps(struct gl_context *ctx) +{ + uint8_t map_2x[2] = {0, 1}; + uint8_t map_4x[4] = {0, 1, 2, 3}; + uint8_t map_8x[8] = {5, 2, 4, 6, 0, 3, 7, 1}; + + memcpy(ctx->Const.SampleMap2x, map_2x, sizeof(map_2x)); + memcpy(ctx->Const.SampleMap4x, map_4x, sizeof(map_4x)); + memcpy(ctx->Const.SampleMap8x, map_8x, sizeof(map_8x)); +} + +/** * 3DSTATE_MULTISAMPLE */ void -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
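Editorial sketch of how a consumer might use the maps filled in above. struct constants, set_sample_maps() and sample_number_8x() are hypothetical names standing in for the gl_constants fields and gen6_set_sample_maps(); the map values are the ones from the patch, and the lookup assumes the 2-wide by 4-high slot numbering shown in its comment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reduction of the gl_constants fields added in patch 1/4. */
struct constants {
   uint8_t SampleMap2x[2];
   uint8_t SampleMap4x[4];
   uint8_t SampleMap8x[8];
};

/* Fill the maps with the Intel values listed in the patch. */
static void
set_sample_maps(struct constants *c)
{
   static const uint8_t map_2x[2] = {0, 1};
   static const uint8_t map_4x[4] = {0, 1, 2, 3};
   static const uint8_t map_8x[8] = {5, 2, 4, 6, 0, 3, 7, 1};

   memcpy(c->SampleMap2x, map_2x, sizeof(map_2x));
   memcpy(c->SampleMap4x, map_4x, sizeof(map_4x));
   memcpy(c->SampleMap8x, map_8x, sizeof(map_8x));
}

/* Grid slot (x, y) in the 2-wide 8X layout -> hardware sample number. */
static unsigned
sample_number_8x(const struct constants *c, unsigned x, unsigned y)
{
   return c->SampleMap8x[y * 2 + x];
}
```

For example, the top-left slot (index 0) resolves to sample 5 and the bottom-right slot (index 7) to sample 1, matching the 8X diagram in the patch comment.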
[Mesa-dev] [PATCH V4 3/4] meta: Implement ext_framebuffer_multisample_blit_scaled extension
The extension enables a multisample buffer resolve and buffer scaling in a single glBlitFramebuffer() call. Currently, this extension is implemented in BLORP, which is only used by SNB and IVB. This patch implements the extension in the meta path, which makes it available to Broadwell.

Implementation features:
- Supports scaled resolves of 2X, 4X and 8X multisample buffers.
- Avoids unnecessary shader compilations by storing the precompiled shaders for each supported sample count.
- Uses bilinear filtering for both GL_SCALED_RESOLVE_FASTEST_EXT and GL_SCALED_RESOLVE_NICEST_EXT filter options. This is allowed behavior in the extension's spec.
- I tried doing bicubic filtering for the GL_SCALED_RESOLVE_NICEST_EXT filter. It made the edges in the image look a little smoother, but the image gets blurred, causing no overall quality improvement. For now I have dropped the idea of using a different filter for the nicest option.

V2:
- Minor changes to simplify the fragment shader.
- Refactor the code to move i965 specific sample_map computation out of Meta. We now use ctx->Const.SampleMap{2,4,8}x variables initialized by the driver.
- Use a simple msaa resolve shader for scaled resolves with scaling factor = 1.0.

V3:
- Make changes to create a string out of ctx->Const.SampleMap{2,4,8}x variables and use it in fragment shader.

V4:
- Make changes to use uint8_t type ctx->Const.SampleMap{2,4,8}x variables.
Signed-off-by: Anuj Phogat --- src/mesa/drivers/common/meta.h | 6 ++ src/mesa/drivers/common/meta_blit.c | 206 +--- 2 files changed, 199 insertions(+), 13 deletions(-) diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h index edc3e8c..2c9517b 100644 --- a/src/mesa/drivers/common/meta.h +++ b/src/mesa/drivers/common/meta.h @@ -279,6 +279,12 @@ enum blit_msaa_shader { BLIT_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_COPY_UINT, BLIT_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_DEPTH_RESOLVE, BLIT_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_DEPTH_COPY, + BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_SCALED_RESOLVE, + BLIT_4X_MSAA_SHADER_2D_MULTISAMPLE_SCALED_RESOLVE, + BLIT_8X_MSAA_SHADER_2D_MULTISAMPLE_SCALED_RESOLVE, + BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_SCALED_RESOLVE, + BLIT_4X_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_SCALED_RESOLVE, + BLIT_8X_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_SCALED_RESOLVE, BLIT_MSAA_SHADER_COUNT, }; diff --git a/src/mesa/drivers/common/meta_blit.c b/src/mesa/drivers/common/meta_blit.c index fc9848a..c7ff2b1 100644 --- a/src/mesa/drivers/common/meta_blit.c +++ b/src/mesa/drivers/common/meta_blit.c @@ -55,6 +55,179 @@ #define OFFSET(FIELD) ((void *) offsetof(struct vertex, FIELD)) static void +setup_glsl_msaa_blit_scaled_shader(struct gl_context *ctx, + struct blit_state *blit, + struct gl_renderbuffer *src_rb, + GLenum target, GLenum filter) +{ + GLint loc_src_width, loc_src_height; + int i, samples; + int shader_offset = 0; + void *mem_ctx = ralloc_context(NULL); + char *fs_source; + char *name, *sample_number; + const uint8_t *sample_map; + char *sample_map_str = rzalloc_size(mem_ctx, 1); + char *sample_map_expr = rzalloc_size(mem_ctx, 1); + char *texel_fetch_macro = rzalloc_size(mem_ctx, 1);; + const char *vs_source; + const char *sampler_array_suffix = ""; + const char *texcoord_type = "vec2"; + float y_scale; + enum blit_msaa_shader shader_index; + + assert(src_rb); + samples = MAX2(src_rb->NumSamples, 1); + y_scale = samples * 0.5; + + /* We expect only power of 2 
samples in source multisample buffer. */ + assert((samples & (samples - 1)) == 0); + while (samples >> (shader_offset + 1)) { + shader_offset++; + } + /* Update the assert if we plan to support more than 8X MSAA. */ + assert(shader_offset > 0 && shader_offset < 4); + + assert(target == GL_TEXTURE_2D_MULTISAMPLE || + target == GL_TEXTURE_2D_MULTISAMPLE_ARRAY); + + shader_index = BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_SCALED_RESOLVE + + shader_offset - 1; + + if (target == GL_TEXTURE_2D_MULTISAMPLE_ARRAY) { + shader_index += BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_SCALED_RESOLVE - + BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_SCALED_RESOLVE; + sampler_array_suffix = "Array"; + texcoord_type = "vec3"; + } + + if (blit->msaa_shaders[shader_index]) { + _mesa_UseProgram(blit->msaa_shaders[shader_index]); + /* Update the uniform values. */ + loc_src_width = + glGetUniformLocation(blit->msaa_shaders[shader_index], "src_width"); + loc_src_height = + glGetUniformLocation(blit->msaa_shaders[shader_index], "src_height"); + glUniform1f(loc_src_width, src_rb->Width); + glUniform1f(loc_src_height, src_rb->Height); + return; + } + + name = ralloc_asprintf(mem_ctx, "vec4 MSAA sc
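The shader-slot computation in the hunk above can be looked at in isolation. What follows is only a minimal sketch under stated assumptions: the enum is a shortened, hypothetical stand-in for enum blit_msaa_shader, and scaled_resolve_index() is a name invented for illustration, not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, shortened stand-in for enum blit_msaa_shader. */
enum toy_blit_shader {
   TOY_2X_SCALED_RESOLVE,
   TOY_4X_SCALED_RESOLVE,
   TOY_8X_SCALED_RESOLVE,
   TOY_2X_ARRAY_SCALED_RESOLVE,
   TOY_4X_ARRAY_SCALED_RESOLVE,
   TOY_8X_ARRAY_SCALED_RESOLVE,
   TOY_SHADER_COUNT
};

/* Map a power-of-two sample count (2, 4 or 8) to a shader slot. */
static int
scaled_resolve_index(int samples, bool is_array)
{
   int shader_offset = 0;
   int index;

   /* Only power-of-two sample counts are expected. */
   assert((samples & (samples - 1)) == 0);

   /* Integer log2: 2 -> 1, 4 -> 2, 8 -> 3. */
   while (samples >> (shader_offset + 1))
      shader_offset++;

   /* Would need updating for anything beyond 8X MSAA. */
   assert(shader_offset > 0 && shader_offset < 4);

   index = TOY_2X_SCALED_RESOLVE + shader_offset - 1;

   /* Array targets use the second group of three slots. */
   if (is_array)
      index += TOY_2X_ARRAY_SCALED_RESOLVE - TOY_2X_SCALED_RESOLVE;

   return index;
}
```

The scheme relies on the 2X/4X/8X enumerators being declared consecutively and in the same order for both the plain and the array variants, which is how the meta.h hunk adds them.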
[Mesa-dev] [PATCH V4 4/4] i965: Enable EXT_framebuffer_multisample_blit_scaled for gen8
Signed-off-by: Anuj Phogat
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 046d2a1..10fe10e 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -256,8 +256,7 @@ intelInitExtensions(struct gl_context *ctx)
       ctx->Extensions.EXT_framebuffer_multisample = true;
       ctx->Extensions.EXT_transform_feedback = true;
-      if (brw->gen < 8)
-         ctx->Extensions.EXT_framebuffer_multisample_blit_scaled = true;
+      ctx->Extensions.EXT_framebuffer_multisample_blit_scaled = true;
       ctx->Extensions.ARB_blend_func_extended =
          !driQueryOptionb(&brw->optionCache, "disable_blend_func_extended");
       ctx->Extensions.ARB_draw_buffers_blend = true;
       ctx->Extensions.ARB_ES3_compatibility = true;
--
1.9.3
Re: [Mesa-dev] [PATCH] gallivm: Disable gallivm to fix build with LLVM 3.6
Jose,

On Wednesday, September 24, 2014 12:42:24 Jose Fonseca wrote:
> That said, the way we use these things are still a bit in flux. Mathias
> has some pending patches. BTW, Mathis, should I submit your patches
> for making llvmpipe thread safe?

Mesa day for me. I did double check the mesa compile with different llvm versions and the latest rebases (no, llvm, llvm-3.5, llvm-3.6 - hope to have caught all configs), which did only require marginal rebase changes. That's what I pushed finally.

Greetings
Mathias
Re: [Mesa-dev] [PATCH 02/13] tgsi: simplify shader properties in tgsi_shader_info
Am 30.09.2014 18:46, schrieb Marek Olšák: > From: Marek Olšák > > Use an array of properties indexed by TGSI_PROPERTY_* definitions. > --- > src/gallium/auxiliary/draw/draw_gs.c | 23 - > src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 15 +++--- > src/gallium/auxiliary/tgsi/tgsi_scan.c | 59 > ++-- > src/gallium/auxiliary/tgsi/tgsi_scan.h | 6 +-- > src/gallium/auxiliary/util/u_pstipple.c | 8 +--- > src/gallium/drivers/llvmpipe/lp_state_fs.c | 10 +--- > src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c | 24 +++--- > src/gallium/drivers/r300/r300_fs.c | 8 +--- > src/gallium/drivers/radeonsi/si_shader.c | 53 +++-- > 9 files changed, 70 insertions(+), 136 deletions(-) > > diff --git a/src/gallium/auxiliary/draw/draw_gs.c > b/src/gallium/auxiliary/draw/draw_gs.c > index 878fcca..0c2f892 100644 > --- a/src/gallium/auxiliary/draw/draw_gs.c > +++ b/src/gallium/auxiliary/draw/draw_gs.c > @@ -750,9 +750,6 @@ draw_create_geometry_shader(struct draw_context *draw, > tgsi_scan_shader(state->tokens, &gs->info); > > /* setup the defaults */ > - gs->input_primitive = PIPE_PRIM_TRIANGLES; > - gs->output_primitive = PIPE_PRIM_TRIANGLE_STRIP; > - gs->max_output_vertices = 32; > gs->max_out_prims = 0; > > #ifdef HAVE_LLVM > @@ -768,17 +765,15 @@ draw_create_geometry_shader(struct draw_context *draw, >gs->vector_length = 1; > } > > - for (i = 0; i < gs->info.num_properties; ++i) { > - if (gs->info.properties[i].name == > - TGSI_PROPERTY_GS_INPUT_PRIM) > - gs->input_primitive = gs->info.properties[i].data[0]; > - else if (gs->info.properties[i].name == > - TGSI_PROPERTY_GS_OUTPUT_PRIM) > - gs->output_primitive = gs->info.properties[i].data[0]; > - else if (gs->info.properties[i].name == > - TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES) > - gs->max_output_vertices = gs->info.properties[i].data[0]; > - } > + gs->input_primitive = > + gs->info.properties[TGSI_PROPERTY_GS_INPUT_PRIM][0]; > + gs->output_primitive = > + gs->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM][0]; > + gs->max_output_vertices 
= > + gs->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0]; > + if (!gs->max_output_vertices) > + gs->max_output_vertices = 32; > + > /* Primitive boundary is bigger than max_output_vertices by one, because > * the specification says that the geometry shader should exit if the > * number of emitted vertices is bigger or equal to max_output_vertices > and > diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c > b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c > index c0bd7be..2d7f32d 100644 > --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c > +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c > @@ -3855,8 +3855,8 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm, > * were forgetting so we're using MAX_VERTEX_VARYING from > * that spec even though we could debug_assert if it's not > * set, but that's a lot uglier. */ > - uint max_output_vertices = 32; > - uint i = 0; > + uint max_output_vertices; > + >/* inputs are always indirect with gs */ >bld.indirect_files |= (1 << TGSI_FILE_INPUT); >bld.gs_iface = gs_iface; > @@ -3864,12 +3864,11 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm, >bld.bld_base.op_actions[TGSI_OPCODE_EMIT].emit = emit_vertex; >bld.bld_base.op_actions[TGSI_OPCODE_ENDPRIM].emit = end_primitive; > > - for (i = 0; i < info->num_properties; ++i) { > - if (info->properties[i].name == > - TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES) { > -max_output_vertices = info->properties[i].data[0]; > - } > - } > + max_output_vertices = > +info->properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0]; > + if (!max_output_vertices) > + max_output_vertices = 32; > + >bld.max_output_vertices_vec = > lp_build_const_int_vec(gallivm, bld.bld_base.int_bld.type, > max_output_vertices); > diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c > b/src/gallium/auxiliary/tgsi/tgsi_scan.c > index c71bb36..f9d1896 100644 > --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c > +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c > @@ -277,13 +277,11 @@ tgsi_scan_shader(const 
struct tgsi_token *tokens, > { > const struct tgsi_full_property *fullprop > = &parse.FullToken.FullProperty; > +unsigned name = fullprop->Property.PropertyName; > > -info->properties[info->num_properties].name = > - fullprop->Property.PropertyName; > -memcpy(info->properties[info->num_properties].data, > - fullprop->u, 8 * sizeof(unsigned));; > - > -++info->num_
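The change described in the patch summary, replacing a searched list of (name, data) pairs with an array indexed by property, can be sketched on its own. This is a toy under stated assumptions: every toy_* name and TOY_* constant is hypothetical, and only the shape (a zero-initialised table, direct lookup, and the "0 means unset, fall back to 32" rule for max output vertices) is taken from the quoted hunks.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the TGSI_PROPERTY_* definitions. */
enum toy_property {
   TOY_PROPERTY_GS_INPUT_PRIM,
   TOY_PROPERTY_GS_OUTPUT_PRIM,
   TOY_PROPERTY_GS_MAX_OUTPUT_VERTICES,
   TOY_PROPERTY_COUNT
};

#define TOY_PROPERTY_DATA_WORDS 8

struct toy_shader_info {
   /* One row per property; a property never declared stays zeroed. */
   unsigned properties[TOY_PROPERTY_COUNT][TOY_PROPERTY_DATA_WORDS];
};

static void
toy_info_init(struct toy_shader_info *info)
{
   memset(info, 0, sizeof(*info));
}

/* What a scanner would do when it meets a property token. */
static void
toy_info_set(struct toy_shader_info *info, enum toy_property name,
             const unsigned *data)
{
   assert(name < TOY_PROPERTY_COUNT);
   memcpy(info->properties[name], data,
          TOY_PROPERTY_DATA_WORDS * sizeof(unsigned));
}

/* Direct lookup, with a fallback when the property is unset (zero). */
static unsigned
toy_max_output_vertices(const struct toy_shader_info *info)
{
   unsigned n = info->properties[TOY_PROPERTY_GS_MAX_OUTPUT_VERTICES][0];
   return n ? n : 32;
}
```

One consequence of this layout is that a zeroed row cannot be told apart from a property explicitly set to 0, so any default other than 0 has to be applied by the reader, as the max-output-vertices lookup does here.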
Re: [Mesa-dev] [PATCH] gallivm: Disable gallivm to fix build with LLVM 3.6
On 09/30/2014 01:00 PM, Mathias Fröhlich wrote:
> Jose,
>
> On Wednesday, September 24, 2014 12:42:24 Jose Fonseca wrote:
>> That said, the way we use these things are still a bit in flux. Mathias
>> has some pending patches. BTW, Mathis, should I submit your patches
>> for making llvmpipe thread safe?
>
> Mesa day for me. I did double check the mesa compile with different llvm
> versions and the latest rebases (no, llvm, llvm-3.5, llvm-3.6 - hope to
> have caught all configs), which did only require marginal rebase changes.
> That's what I pushed finally.

My linux build is broken:

In file included from draw/draw_context.c:49:0:
./gallivm/lp_bld_init.h:47:4: error: unknown type name 'LLVMMCJITMemoryManagerRef'

$ llvm-config --version
3.2

Yeah, it's a bit old, but it was working until now. Some of our other automated builds are failing too, probably with newer LLVM versions... Let me investigate.

-Brian
Re: [Mesa-dev] [PATCH] gallivm: Disable gallivm to fix build with LLVM 3.6
On 09/30/2014 01:16 PM, Brian Paul wrote:
> On 09/30/2014 01:00 PM, Mathias Fröhlich wrote:
>> Jose,
>>
>> On Wednesday, September 24, 2014 12:42:24 Jose Fonseca wrote:
>>> That said, the way we use these things are still a bit in flux. Mathias
>>> has some pending patches. BTW, Mathis, should I submit your patches
>>> for making llvmpipe thread safe?
>>
>> Mesa day for me. I did double check the mesa compile with different llvm
>> versions and the latest rebases (no, llvm, llvm-3.5, llvm-3.6 - hope to
>> have caught all configs), which did only require marginal rebase changes.
>> That's what I pushed finally.
>
> My linux build is broken:
>
> In file included from draw/draw_context.c:49:0:
> ./gallivm/lp_bld_init.h:47:4: error: unknown type name
> 'LLVMMCJITMemoryManagerRef'
>
> $ llvm-config --version
> 3.2
>
> Yeah, it's a bit old, but it was working until now. Some of our other
> automated builds are failing too, probably with newer LLVM versions...
> Let me investigate.

Same failure with LLVM 3.3.1 too.

-Brian
Re: [Mesa-dev] [PATCH] gallivm: Disable gallivm to fix build with LLVM 3.6
Hi,

On Tuesday, September 30, 2014 13:17:31 Brian Paul wrote:
> Same failure with LLVM 3.3.1 too.

Ok, that's what I did not try. Sorry. I will try to followup immediately ...

Greetings
Mathias
Re: [Mesa-dev] [PATCH] gallivm: Disable gallivm to fix build with LLVM 3.6
On 09/30/2014 01:26 PM, Mathias Fröhlich wrote:
> Hi,
>
> On Tuesday, September 30, 2014 13:17:31 Brian Paul wrote:
>> Same failure with LLVM 3.3.1 too.
>
> Ok, that's what I did not try. Sorry. I will try to followup immediately ...

Thanks, Mathias. But I'm about to post a patch that fixes things for LLVM 3.2 for me... Let me know what you think.

-Brian
[Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2
Move the USE_MCJIT / HAVE_AVX determination logic to lp_bld.h. If we don't have MCJIT define a dummy LLVMMCJITMemoryManagerRef type to avoid excessive #ifdef testing elsewhere. --- src/gallium/auxiliary/gallivm/lp_bld.h | 40 +++ src/gallium/auxiliary/gallivm/lp_bld_init.c | 33 +- 2 files changed, 41 insertions(+), 32 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld.h b/src/gallium/auxiliary/gallivm/lp_bld.h index fcf4f16..3d156e8 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld.h +++ b/src/gallium/auxiliary/gallivm/lp_bld.h @@ -58,6 +58,46 @@ #endif +/* Only MCJIT is available as of LLVM SVN r216982 */ +#if HAVE_LLVM >= 0x0306 + +#define USE_MCJIT 1 +#define HAVE_AVX 1 + +#else + +/** + * AVX is supported in: + * - standard JIT from LLVM 3.2 onwards + * - MC-JIT from LLVM 3.1 + * - MC-JIT supports limited OSes (MacOSX and Linux) + * - standard JIT in LLVM 3.1, with backports + */ +#if defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64) +# define USE_MCJIT 1 +# define HAVE_AVX 0 +#elif HAVE_LLVM >= 0x0302 || (HAVE_LLVM == 0x0301 && defined(HAVE_JIT_AVX_SUPPORT)) +# define USE_MCJIT 0 +# define HAVE_AVX 1 +#elif HAVE_LLVM == 0x0301 && (defined(PIPE_OS_LINUX) || defined(PIPE_OS_APPLE)) +# define USE_MCJIT 1 +# define HAVE_AVX 1 +#else +# define USE_MCJIT 0 +# define HAVE_AVX 0 +#endif + +#endif /* HAVE_LLVM >= 0x0306 */ + + +#if !USE_MCJIT +/* We won't actually use LLVMMCJITMemoryManagerRef, just create a dummy + * typedef to simplify things elsewhere. + */ +typedef void *LLVMMCJITMemoryManagerRef; +#endif + + /** * Redefine these LLVM entrypoints as invalid macros to make sure we * don't accidentally use them. 
We need to use the functions which diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c index 4e4aecb..3be14c2 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_init.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c @@ -43,37 +43,6 @@ #include -/* Only MCJIT is available as of LLVM SVN r216982 */ -#if HAVE_LLVM >= 0x0306 - -#define USE_MCJIT 1 -#define HAVE_AVX 1 - -#else - -/** - * AVX is supported in: - * - standard JIT from LLVM 3.2 onwards - * - MC-JIT from LLVM 3.1 - * - MC-JIT supports limited OSes (MacOSX and Linux) - * - standard JIT in LLVM 3.1, with backports - */ -#if defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64) -# define USE_MCJIT 1 -# define HAVE_AVX 0 -#elif HAVE_LLVM >= 0x0302 || (HAVE_LLVM == 0x0301 && defined(HAVE_JIT_AVX_SUPPORT)) -# define USE_MCJIT 0 -# define HAVE_AVX 1 -#elif HAVE_LLVM == 0x0301 && (defined(PIPE_OS_LINUX) || defined(PIPE_OS_APPLE)) -# define USE_MCJIT 1 -# define HAVE_AVX 1 -#else -# define USE_MCJIT 0 -# define HAVE_AVX 0 -#endif - -#endif /* HAVE_LLVM >= 0x0306 */ - #if USE_MCJIT void LLVMLinkInMCJIT(); #endif @@ -219,7 +188,7 @@ gallivm_free_code(struct gallivm_state *gallivm) assert(!gallivm->engine); lp_free_generated_code(gallivm->code); gallivm->code = NULL; -#if HAVE_LLVM < 0x0306 +#if USE_MCJIT LLVMDisposeMCJITMemoryManager(gallivm->memorymgr); gallivm->memorymgr = NULL; #endif -- 1.7.10.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
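The idea in the commit message, deciding the feature macros once in a header and supplying a dummy typedef when the real type is unavailable, can be sketched in a few lines. This is a simplified illustration, not the patch itself: ToyMemoryManagerRef, toy_state and the default HAVE_LLVM value are assumptions made up for the sketch, and the version test omits the architecture and LLVM 3.1 cases that the real header handles.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: the build system would normally define
 * HAVE_LLVM; 0x0302 stands for LLVM 3.2. */
#ifndef HAVE_LLVM
#define HAVE_LLVM 0x0302
#endif

/* Decide the feature macro once, in one place. */
#if HAVE_LLVM >= 0x0306
#define USE_MCJIT 1
#else
#define USE_MCJIT 0
#endif

#if !USE_MCJIT
/* The real type does not exist in this configuration; a dummy typedef
 * keeps declarations that mention it compiling. */
typedef void *ToyMemoryManagerRef;
#else
typedef struct toy_opaque_memory_manager *ToyMemoryManagerRef;
#endif

struct toy_state {
   ToyMemoryManagerRef memorymgr;   /* no #ifdef needed at the use site */
};

static void
toy_state_free(struct toy_state *state)
{
   assert(state != NULL);
#if USE_MCJIT
   /* A real implementation would dispose of the manager here. */
#endif
   state->memorymgr = NULL;
}
```

The trade-off is that the field exists in every configuration, so only the code that actually creates or destroys the manager has to be conditional.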
Re: [Mesa-dev] [PATCH 2/2] i965: Use BDW_MOCS_PTE for renderbuffers.
On Tue, Sep 30, 2014 at 1:15 AM, Kenneth Graunke wrote:
> Write-back caching cannot be used for buffers being scanned out by the
> display engine; surfaces used for scan-out must be write-through or
> uncached. I originally chose WT for render targets because it works in
> all cases. However, we really want to use write-back caching where
> possible, as it is more efficient.
>
> Most renderbuffers are not used for scanout - off-screen FBOs certainly
> are fine, and non-pageflipped backbuffers should be fine as well. So
> in most cases WB will work. However, we don't know what will be used
> for scan-out, so we instead simply use the PTE value specified by the
> kernel, as it knows these things.
>
> This matches our MOCS choice on Haswell.
>
> Fixes performance regressions since commit ee4484be3dc827cf15bcf109f5
> in a microbenchmark (spotted by Eero Tamminen). Improves performance
> in GLBenchmark 2.7/EgyptHD by 7.44362% +/- 0.496939% (n=55) on a
> Broadwell GT2.
>
> Signed-off-by: Kenneth Graunke
> Reported-by: Eero Tamminen
> Cc: mesa-sta...@lists.freedesktop.org

That makes sense, good find from Eero. I'll update the SKL MOCS accordingly.

Reviewed-by: Kristian Høgsberg

> src/mesa/drivers/dri/i965/gen8_surface_state.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Cc'd to stable because it's a pretty trivial change and provides a sizable
> boost to performance on new hardware.
>
> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> index 40eb2ea..6dd343f 100644
> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> @@ -377,7 +377,7 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
>               horizontal_alignment(mt) |
>               surface_tiling_mode(tiling);
>
> -   surf[1] = SET_FIELD(BDW_MOCS_WT, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
> +   surf[1] = SET_FIELD(BDW_MOCS_PTE, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
>
>     surf[2] = SET_FIELD(width - 1, GEN7_SURFACE_WIDTH) |
>               SET_FIELD(height - 1, GEN7_SURFACE_HEIGHT);
> --
> 2.1.1
Re: [Mesa-dev] [PATCH 02/13] tgsi: simplify shader properties in tgsi_shader_info
On Tue, Sep 30, 2014 at 9:04 PM, Roland Scheidegger wrote: > Am 30.09.2014 18:46, schrieb Marek Olšák: >> From: Marek Olšák >> >> Use an array of properties indexed by TGSI_PROPERTY_* definitions. >> --- >> src/gallium/auxiliary/draw/draw_gs.c | 23 - >> src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 15 +++--- >> src/gallium/auxiliary/tgsi/tgsi_scan.c | 59 >> ++-- >> src/gallium/auxiliary/tgsi/tgsi_scan.h | 6 +-- >> src/gallium/auxiliary/util/u_pstipple.c | 8 +--- >> src/gallium/drivers/llvmpipe/lp_state_fs.c | 10 +--- >> src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c | 24 +++--- >> src/gallium/drivers/r300/r300_fs.c | 8 +--- >> src/gallium/drivers/radeonsi/si_shader.c | 53 +++-- >> 9 files changed, 70 insertions(+), 136 deletions(-) >> >> diff --git a/src/gallium/auxiliary/draw/draw_gs.c >> b/src/gallium/auxiliary/draw/draw_gs.c >> index 878fcca..0c2f892 100644 >> --- a/src/gallium/auxiliary/draw/draw_gs.c >> +++ b/src/gallium/auxiliary/draw/draw_gs.c >> @@ -750,9 +750,6 @@ draw_create_geometry_shader(struct draw_context *draw, >> tgsi_scan_shader(state->tokens, &gs->info); >> >> /* setup the defaults */ >> - gs->input_primitive = PIPE_PRIM_TRIANGLES; >> - gs->output_primitive = PIPE_PRIM_TRIANGLE_STRIP; >> - gs->max_output_vertices = 32; >> gs->max_out_prims = 0; >> >> #ifdef HAVE_LLVM >> @@ -768,17 +765,15 @@ draw_create_geometry_shader(struct draw_context *draw, >>gs->vector_length = 1; >> } >> >> - for (i = 0; i < gs->info.num_properties; ++i) { >> - if (gs->info.properties[i].name == >> - TGSI_PROPERTY_GS_INPUT_PRIM) >> - gs->input_primitive = gs->info.properties[i].data[0]; >> - else if (gs->info.properties[i].name == >> - TGSI_PROPERTY_GS_OUTPUT_PRIM) >> - gs->output_primitive = gs->info.properties[i].data[0]; >> - else if (gs->info.properties[i].name == >> - TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES) >> - gs->max_output_vertices = gs->info.properties[i].data[0]; >> - } >> + gs->input_primitive = >> + gs->info.properties[TGSI_PROPERTY_GS_INPUT_PRIM][0]; >> + 
gs->output_primitive = >> + gs->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM][0]; >> + gs->max_output_vertices = >> + gs->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0]; >> + if (!gs->max_output_vertices) >> + gs->max_output_vertices = 32; >> + >> /* Primitive boundary is bigger than max_output_vertices by one, because >> * the specification says that the geometry shader should exit if the >> * number of emitted vertices is bigger or equal to max_output_vertices >> and >> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c >> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c >> index c0bd7be..2d7f32d 100644 >> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c >> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c >> @@ -3855,8 +3855,8 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm, >> * were forgetting so we're using MAX_VERTEX_VARYING from >> * that spec even though we could debug_assert if it's not >> * set, but that's a lot uglier. */ >> - uint max_output_vertices = 32; >> - uint i = 0; >> + uint max_output_vertices; >> + >>/* inputs are always indirect with gs */ >>bld.indirect_files |= (1 << TGSI_FILE_INPUT); >>bld.gs_iface = gs_iface; >> @@ -3864,12 +3864,11 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm, >>bld.bld_base.op_actions[TGSI_OPCODE_EMIT].emit = emit_vertex; >>bld.bld_base.op_actions[TGSI_OPCODE_ENDPRIM].emit = end_primitive; >> >> - for (i = 0; i < info->num_properties; ++i) { >> - if (info->properties[i].name == >> - TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES) { >> -max_output_vertices = info->properties[i].data[0]; >> - } >> - } >> + max_output_vertices = >> +info->properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0]; >> + if (!max_output_vertices) >> + max_output_vertices = 32; >> + >>bld.max_output_vertices_vec = >> lp_build_const_int_vec(gallivm, bld.bld_base.int_bld.type, >> max_output_vertices); >> diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c >> b/src/gallium/auxiliary/tgsi/tgsi_scan.c >> index 
c71bb36..f9d1896 100644 >> --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c >> +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c >> @@ -277,13 +277,11 @@ tgsi_scan_shader(const struct tgsi_token *tokens, >> { >> const struct tgsi_full_property *fullprop >> = &parse.FullToken.FullProperty; >> +unsigned name = fullprop->Property.PropertyName; >> >> -info->properties[info->num_properties].name = >> - fullprop->Property.PropertyName; >>
Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2
Hi Brian, On Tuesday, September 30, 2014 13:30:21 Brian Paul wrote: > Move the USE_MCJIT / HAVE_AVX determination logic to lp_bld.h. If we > don't have MCJIT define a dummy LLVMMCJITMemoryManagerRef type to avoid > excessive #ifdef testing elsewhere. [...] > @@ -219,7 +188,7 @@ gallivm_free_code(struct gallivm_state *gallivm) > assert(!gallivm->engine); > lp_free_generated_code(gallivm->code); > gallivm->code = NULL; > -#if HAVE_LLVM < 0x0306 > +#if USE_MCJIT We will probably still need the < 0x0306 check: #if HAVE_LLVM < 0x0306 && USE_MCJIT since this memorymanager stuff just vanished in the way 3.5 implemented this with version 3.6. > LLVMDisposeMCJITMemoryManager(gallivm->memorymgr); > gallivm->memorymgr = NULL; > #endif > Also, we will probably fail to compile the LLVMDisposeMCJITMemoryManager call under some configurations with MCJIT and older llvm. So, additionally to what you had, how about the attached one? I am still trying to verify this change against 3.5 and 3.6. I am not sure about 3.2 since it did not build out of the box with my configure line. 
Greetings Mathias>From 915222ba9ed262d4c8deeafd2bfd530bb0a769ba Mon Sep 17 00:00:00 2001 Message-Id: <915222ba9ed262d4c8deeafd2bfd530bb0a769ba.1412109598.git.mathias.froehl...@gmx.net> From: =?UTF-8?q?Mathias=20Fr=C3=B6hlich?= Date: Tue, 30 Sep 2014 22:11:30 +0200 Subject: [PATCH] gallivm: fix build for LLVM 3.2 --- src/gallium/auxiliary/gallivm/lp_bld.h| 40 +++ src/gallium/auxiliary/gallivm/lp_bld_init.c | 37 ++--- src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 9 ++ src/gallium/auxiliary/gallivm/lp_bld_misc.h | 3 ++ 4 files changed, 54 insertions(+), 35 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld.h b/src/gallium/auxiliary/gallivm/lp_bld.h index fcf4f16..218f537 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld.h +++ b/src/gallium/auxiliary/gallivm/lp_bld.h @@ -58,6 +58,46 @@ #endif +/* Only MCJIT is available as of LLVM SVN r216982 */ +#if HAVE_LLVM >= 0x0306 + +#define USE_MCJIT 1 +#define HAVE_AVX 1 + +#else + +/** + * AVX is supported in: + * - standard JIT from LLVM 3.2 onwards + * - MC-JIT from LLVM 3.1 + * - MC-JIT supports limited OSes (MacOSX and Linux) + * - standard JIT in LLVM 3.1, with backports + */ +#if defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64) +# define USE_MCJIT 1 +# define HAVE_AVX 0 +#elif HAVE_LLVM >= 0x0302 || (HAVE_LLVM == 0x0301 && defined(HAVE_JIT_AVX_SUPPORT)) +# define USE_MCJIT 0 +# define HAVE_AVX 1 +#elif HAVE_LLVM == 0x0301 && (defined(PIPE_OS_LINUX) || defined(PIPE_OS_APPLE)) +# define USE_MCJIT 1 +# define HAVE_AVX 1 +#else +# define USE_MCJIT 0 +# define HAVE_AVX 0 +#endif + +#endif /* HAVE_LLVM >= 0x0306 */ + + +#if HAVE_LLVM <= 0x0303 +/* We won't actually use LLVMMCJITMemoryManagerRef, just create a dummy + * typedef to simplify things elsewhere. + */ +typedef void *LLVMMCJITMemoryManagerRef; +#endif + + /** * Redefine these LLVM entrypoints as invalid macros to make sure we * don't accidentally use them. 
We need to use the functions which diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c index 4e4aecb..2f5b4ba 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_init.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c @@ -43,37 +43,6 @@ #include -/* Only MCJIT is available as of LLVM SVN r216982 */ -#if HAVE_LLVM >= 0x0306 - -#define USE_MCJIT 1 -#define HAVE_AVX 1 - -#else - -/** - * AVX is supported in: - * - standard JIT from LLVM 3.2 onwards - * - MC-JIT from LLVM 3.1 - * - MC-JIT supports limited OSes (MacOSX and Linux) - * - standard JIT in LLVM 3.1, with backports - */ -#if defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64) -# define USE_MCJIT 1 -# define HAVE_AVX 0 -#elif HAVE_LLVM >= 0x0302 || (HAVE_LLVM == 0x0301 && defined(HAVE_JIT_AVX_SUPPORT)) -# define USE_MCJIT 0 -# define HAVE_AVX 1 -#elif HAVE_LLVM == 0x0301 && (defined(PIPE_OS_LINUX) || defined(PIPE_OS_APPLE)) -# define USE_MCJIT 1 -# define HAVE_AVX 1 -#else -# define USE_MCJIT 0 -# define HAVE_AVX 0 -#endif - -#endif /* HAVE_LLVM >= 0x0306 */ - #if USE_MCJIT void LLVMLinkInMCJIT(); #endif @@ -219,10 +188,8 @@ gallivm_free_code(struct gallivm_state *gallivm) assert(!gallivm->engine); lp_free_generated_code(gallivm->code); gallivm->code = NULL; -#if HAVE_LLVM < 0x0306 - LLVMDisposeMCJITMemoryManager(gallivm->memorymgr); + lp_free_memory_manager(gallivm->memorymgr); gallivm->memorymgr = NULL; -#endif } @@ -317,7 +284,7 @@ init_gallivm_state(struct gallivm_state *gallivm, const char *name, if (!gallivm->builder) goto fail; -#if HAVE_LLVM < 0x0306 +#if USE_MCJIT && HAVE_LLVM < 0x0306 gallivm->memorymgr = lp_get_default_memory_manager(); if (!gallivm->memorymgr) goto fail
Re: [Mesa-dev] [PATCH 2/2] galahad: fix indirect draw
Series looks good. Thanks for looking into this, Roland.

It looks like nobody else is using galahad, nor looking at the warnings. I wonder if it makes sense to keep using/updating it.

Jose

From: srol...@vmware.com
Sent: 30 September 2014 19:07
To: Jose Fonseca; mesa-dev@lists.freedesktop.org
Cc: Roland Scheidegger
Subject: [PATCH 2/2] galahad: fix indirect draw

From: Roland Scheidegger

Need to unwrap the indirect resource otherwise bad things will happen. Fixes random crashes and timeouts with piglit's arb_indirect_draw tests.
---
 src/gallium/drivers/galahad/glhd_context.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/galahad/glhd_context.c b/src/gallium/drivers/galahad/glhd_context.c
index 79d5495..37ea170 100644
--- a/src/gallium/drivers/galahad/glhd_context.c
+++ b/src/gallium/drivers/galahad/glhd_context.c
@@ -49,7 +49,7 @@ galahad_context_destroy(struct pipe_context *_pipe)

 static void
 galahad_context_draw_vbo(struct pipe_context *_pipe,
-                 const struct pipe_draw_info *info)
+                         const struct pipe_draw_info *info)
 {
    struct galahad_context *glhd_pipe = galahad_context(_pipe);
    struct pipe_context *pipe = glhd_pipe->pipe;
@@ -58,7 +58,14 @@ galahad_context_draw_vbo(struct pipe_context *_pipe,
     * before drawing.
     */

-   pipe->draw_vbo(pipe, info);
+   if (info->indirect) {
+      struct pipe_draw_info info_unwrapped = *info;
+      info_unwrapped.indirect = galahad_resource_unwrap(info->indirect);
+      pipe->draw_vbo(pipe, &info_unwrapped);
+   }
+   else {
+      pipe->draw_vbo(pipe, info);
+   }
 }

 static struct pipe_query *
--
1.9.1
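The fix in the quoted patch follows a common pattern for wrapper layers: when the incoming struct is const and one of its fields points at a wrapper object, copy the struct, swap in the unwrapped pointer, and pass the copy down. A minimal sketch of that pattern, with every toy_* type and function invented for illustration (none of this is the galahad or gallium API):

```c
#include <assert.h>
#include <stddef.h>

struct toy_resource { int id; };

/* Wrapper object a debugging layer hands out in place of the real one. */
struct toy_wrapped_resource {
   struct toy_resource base;       /* what callers see */
   struct toy_resource *real;      /* what the driver below understands */
};

struct toy_draw_info {
   unsigned count;
   struct toy_resource *indirect;  /* NULL for direct draws */
};

static struct toy_resource *
toy_unwrap(struct toy_resource *res)
{
   /* Valid because 'base' is the first member of the wrapper. */
   return res ? ((struct toy_wrapped_resource *)res)->real : NULL;
}

/* Records what the lower layer received, so the effect can be checked. */
static struct toy_resource *last_seen_by_driver;

static void
toy_driver_draw(const struct toy_draw_info *info)
{
   last_seen_by_driver = info->indirect;
}

static void
toy_layer_draw(const struct toy_draw_info *info)
{
   assert(info != NULL);

   if (info->indirect) {
      /* info is const, so patch a local copy instead of the caller's. */
      struct toy_draw_info unwrapped = *info;
      unwrapped.indirect = toy_unwrap(info->indirect);
      toy_driver_draw(&unwrapped);
   } else {
      toy_driver_draw(info);
   }
}
```

The copy lives on the stack only for the duration of the call, so the caller's struct still refers to the wrapper afterwards.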
Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2
On 09/30/2014 02:40 PM, Mathias Fröhlich wrote: Hi Brian, On Tuesday, September 30, 2014 13:30:21 Brian Paul wrote: Move the USE_MCJIT / HAVE_AVX determination logic to lp_bld.h. If we don't have MCJIT define a dummy LLVMMCJITMemoryManagerRef type to avoid excessive #ifdef testing elsewhere. [...] @@ -219,7 +188,7 @@ gallivm_free_code(struct gallivm_state *gallivm) assert(!gallivm->engine); lp_free_generated_code(gallivm->code); gallivm->code = NULL; -#if HAVE_LLVM < 0x0306 +#if USE_MCJIT We will probably still need the < 0x0306 check: #if HAVE_LLVM < 0x0306 && USE_MCJIT since this memorymanager stuff just vanished in the way 3.5 implemented this with version 3.6. LLVMDisposeMCJITMemoryManager(gallivm->memorymgr); gallivm->memorymgr = NULL; #endif Also, we will probably fail to compile the LLVMDisposeMCJITMemoryManager call under some configurations with MCJIT and older llvm. So, additionally to what you had, how about the attached one? I am still trying to verify this change against 3.5 and 3.6. I am not sure about 3.2 since it did not build out of the box with my configure line. It compiles, but I get a segfault when I try to run anything: Program received signal SIGSEGV, Segmentation fault. 0x7797a579 in DelegatingJITMemoryManager::setMemoryWritable (this=0x6f5cb0) at gallivm/lp_bld_misc.cpp:165 165 mgr()->setMemoryWritable(); (gdb) where #0 0x7797a579 in DelegatingJITMemoryManager::setMemoryWritable (this=0x6f5cb0) at gallivm/lp_bld_misc.cpp:165 #1 0x751579cc in (anonymous namespace)::JITEmitter::startFunction (this=0x727b20, F=...) at /build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JITEmitter.cpp:782 #2 0x7567e595 in (anonymous namespace)::Emitter::runOnMachineFunction (this=0x75dd50, MF=...) at /build/buildd/llvm-3.2-3.2/lib/Target/X86/X86CodeEmitter.cpp:145 #3 0x7501bbbf in runOnFunction (F=..., this=0x727980) at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1498 #4 llvm::FPPassManager::runOnFunction (this=0x727980, F=...) 
at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1476 #5 0x7501f2bb in llvm::FunctionPassManagerImpl::run (this=0x6fc950, F=...) at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1449 #6 0x7501f396 in llvm::FunctionPassManager::run (this=0x6f5d20, F=...) at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1379 #7 0x7514e637 in llvm::JIT::jitTheFunction (this=this@entry=0x6fc800, F=F@entry=0x769720, locked=...) at /build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JIT.cpp:645 #8 0x7514ec2f in llvm::JIT::runJITOnFunctionUnlocked (this=this@entry=0x6fc800, F=F@entry=0x769720, locked=...) at /build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JIT.cpp:624 #9 0x7514ed89 in llvm::JIT::getPointerToFunction (this=0x6fc800, F=0x769720) at /build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JIT.cpp:681 #10 0x77941c04 in gallivm_jit_function (gallivm=0x6f5a90, func=0x769720) at gallivm/lp_bld_init.c:586 #11 0x779b54d2 in generate_variant (lp=0x61f750, shader=0x6fdd10, key=0x7fffd9a0) at lp_state_fs.c:2634 #12 0x779b6a77 in llvmpipe_update_fs (lp=0x61f750) at lp_state_fs.c:3166 #13 0x779ac7bb in llvmpipe_update_derived (llvmpipe=0x61f750) at lp_state_derived.c:186 #14 0x77984562 in llvmpipe_draw_vbo (pipe=0x61f750, info=0x7fffdcc0) at lp_draw_arrays.c:70 #15 0x7785c1d3 in cso_draw_vbo (cso=0x6b9260, info=0x7fffdcc0) at cso_cache/cso_context.c:1418 #16 0x7771a373 in st_draw_vbo (ctx=0x77ec4010, prims=0x6ab7c0, nr_prims=2, ib=0x0, index_bounds_valid=1 '\001', min_index=0, max_index=161, tfb_vertcount=0x0, indirect=0x0) at ../../src/mesa/state_tracker/st_draw.c:285 #17 0x776f7e3f in vbo_save_playback_vertex_list (ctx=0x77ec4010, data=0x6ab3ec) at ../../src/mesa/vbo/vbo_save_draw.c:310 #18 0x77524e18 in ext_opcode_execute (ctx=0x77ec4010, node=0x6ab3e8) at ../../src/mesa/main/dlist.c:658 #19 0x7753b5db in execute_list (ctx=0x77ec4010, list=1) at ../../src/mesa/main/dlist.c:7692 #20 0x77541f2b in _mesa_CallList (list=1) at ../../src/mesa/main/dlist.c:9121 #21 
0x00402e2d in draw () at gears.c:196 #22 0x76ee1376 in processWindowWorkList (window=0x61b180) at glut_event.c:1307 #23 0x76ee232c in __glutProcessWindowWorkLists () at glut_event.c:1358 #24 glutMainLoop () at glut_event.c:1379 #25 0x004036ea in main (argc=1, argv=0x7fffe658) at gears.c:405 -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH 05/56] mesa/main: Add tessellation shader state and limits
On 09/30/2014 11:24 AM, Matt Turner wrote:
> On Tue, Sep 30, 2014 at 8:50 AM, Ian Romanick wrote:
>> On 09/20/2014 07:41 PM, Matt Turner wrote:
>>> On Sat, Sep 20, 2014 at 6:40 PM, Chris Forbes wrote:
>>>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>>>> index 79d2e94..c11ad4f 100644
>>>> --- a/src/mesa/main/shaderapi.c
>>>> +++ b/src/mesa/main/shaderapi.c
>>>> @@ -105,6 +105,7 @@ _mesa_get_shader_flags(void)
>>>>  void
>>>>  _mesa_init_shader_state(struct gl_context *ctx)
>>>>  {
>>>> +   int i;
>>>
>>> In context, this declaration looks odd. Move it below the two just
>>> after this hunk?
>>
>> Not in core Mesa where we have to do dumb ol' C89. :(
>
> Move it after the other two variable declarations...
>
>    /* Device drivers may override these to control what kind of instructions
>     * are generated by the GLSL compiler.
>     */
>    struct gl_shader_compiler_options options;
>    gl_shader_stage sh;

Oh... yeah, that's fine. I misunderstood you before.
Re: [Mesa-dev] [PATCH 1/3] i965/vec4: Call opt_algebraic after opt_cse.
On Sat, Sep 27, 2014 at 12:12 PM, Matt Turner wrote: > The next patch adds an algebraic optimization for the pattern > >sqrt a, b >rcp c, a > > and turns it into > >sqrt a, b >rsq c, b > > but many vertex shaders do > >a = sqrt(b); >var1 /= a; >var2 /= a; > > which generates > >sqrt a, b >rcp c, a >rcp d, a > > If we apply the algebraic optimization before CSE, we'll end up with > >sqrt a, b >rsq c, b >rcp d, a > > Applying CSE combines the RCP instructions, preventing this from > happening. > > No shader-db changes. > --- > src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index 022ed37..e0a3d5f 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -1790,8 +1790,8 @@ vec4_visitor::run() >OPT(dead_code_eliminate); >OPT(dead_control_flow_eliminate, this); >OPT(opt_copy_propagation); > - OPT(opt_algebraic); >OPT(opt_cse); > + OPT(opt_algebraic); >OPT(opt_register_coalesce); > } while (progress); > > -- > 1.8.5.5 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev For the series: Reviewed-by: Anuj Phogat ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] i965/vec4: Call opt_algebraic after opt_cse.
On 09/27/2014 12:12 PM, Matt Turner wrote: > The next patch adds an algebraic optimization for the pattern > >sqrt a, b >rcp c, a > > and turns it into > >sqrt a, b >rsq c, b > > but many vertex shaders do > >a = sqrt(b); >var1 /= a; >var2 /= a; > > which generates > >sqrt a, b >rcp c, a >rcp d, a > > If we apply the algebraic optimization before CSE, we'll end up with > >sqrt a, b >rsq c, b >rcp d, a Why doesn't a second pass through opt_algebraic turn this into rsq c, b rsq d, b Seems like this could cause us to miss other optimization opportunities... > Applying CSE combines the RCP instructions, preventing this from > happening. > > No shader-db changes. > --- > src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index 022ed37..e0a3d5f 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -1790,8 +1790,8 @@ vec4_visitor::run() >OPT(dead_code_eliminate); >OPT(dead_control_flow_eliminate, this); >OPT(opt_copy_propagation); > - OPT(opt_algebraic); >OPT(opt_cse); > + OPT(opt_algebraic); >OPT(opt_register_coalesce); > } while (progress); > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] i965/vec4: Call opt_algebraic after opt_cse.
On Tue, Sep 30, 2014 at 2:10 PM, Ian Romanick wrote: > On 09/27/2014 12:12 PM, Matt Turner wrote: >> The next patch adds an algebraic optimization for the pattern >> >>sqrt a, b >>rcp c, a >> >> and turns it into >> >>sqrt a, b >>rsq c, b >> >> but many vertex shaders do >> >>a = sqrt(b); >>var1 /= a; >>var2 /= a; >> >> which generates >> >>sqrt a, b >>rcp c, a >>rcp d, a >> >> If we apply the algebraic optimization before CSE, we'll end up with >> >>sqrt a, b >>rsq c, b >>rcp d, a > > Why doesn't a second pass through opt_algebraic turn this into Because the addition in patch #2 just recognizes a consecutive sqrt+rcp pattern. >rsq c, b >rsq d, b > > Seems like this could cause us to miss other optimization opportunities... This seems pretty sufficient for the collection of shaders in shader-db -- no regressions, cuts vec4 instructions, and handles 410 sqrt+rcp pairs. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
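The ordering argument in this thread can be reproduced with a toy model. The sketch below is not the i965 code: `Inst`, `toy_opt_algebraic` and `toy_opt_cse` are hypothetical stand-ins, the algebraic rule only matches an `rcp` that directly follows the `sqrt` (as described for patch #2), and the CSE ignores register redefinitions. It only illustrates why running CSE first leaves one fewer math instruction for the three-instruction example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction "op dst, src". Not the real vec4_instruction.
struct Inst {
    std::string op, dst, src;
};

// Stand-in for the patch #2 rule: rewrite "rcp c, a" into "rsq c, b",
// but only when it immediately follows "sqrt a, b".
bool toy_opt_algebraic(std::vector<Inst> &prog)
{
    bool progress = false;
    for (std::size_t i = 1; i < prog.size(); i++) {
        const Inst &prev = prog[i - 1];
        Inst &cur = prog[i];
        if (prev.op == "sqrt" && cur.op == "rcp" && cur.src == prev.dst) {
            cur.op = "rsq";
            cur.src = prev.src;
            progress = true;
        }
    }
    return progress;
}

// Toy CSE: a later instruction with the same op and source as an earlier
// one is turned into a "mov" from the earlier destination.
bool toy_opt_cse(std::vector<Inst> &prog)
{
    bool progress = false;
    for (std::size_t i = 0; i < prog.size(); i++) {
        if (prog[i].op == "mov")
            continue;
        for (std::size_t j = i + 1; j < prog.size(); j++) {
            if (prog[j].op == prog[i].op && prog[j].src == prog[i].src) {
                prog[j].op = "mov";
                prog[j].src = prog[i].dst;
                progress = true;
            }
        }
    }
    return progress;
}

// Runs both passes to a fixed point, like the do/while (progress) loop,
// and returns how many non-mov (math) instructions remain.
std::size_t math_ops_after(bool algebraic_first)
{
    std::vector<Inst> prog = {
        {"sqrt", "a", "b"}, {"rcp", "c", "a"}, {"rcp", "d", "a"},
    };
    bool progress;
    do {
        progress = false;
        if (algebraic_first) {
            progress |= toy_opt_algebraic(prog);
            progress |= toy_opt_cse(prog);
        } else {
            progress |= toy_opt_cse(prog);
            progress |= toy_opt_algebraic(prog);
        }
    } while (progress);

    std::size_t n = 0;
    for (const Inst &inst : prog)
        if (inst.op != "mov")
            n++;
    return n;
}

// Self-check of the two orderings discussed in the thread.
void check_pass_order()
{
    assert(math_ops_after(true) == 3);   // sqrt, rsq, rcp
    assert(math_ops_after(false) == 2);  // sqrt, rsq (plus one mov)
}
```

With the algebraic pass first, the second `rcp` is no longer adjacent to the `sqrt` and no longer matches the rewritten `rsq`, so neither pass can touch it on later iterations.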
Re: [Mesa-dev] [PATCH 1/5] i965/fs: Extend predicated break pass to predicate WHILE.
On Mon, Sep 8, 2014 at 12:21 PM, Matt Turner wrote:
> Helps a handful of programs in Serious Sam 3 that use do-while loops.
>
> instructions in affected programs: 16114 -> 16075 (-0.24%)
> ---

How about a review?
Re: [Mesa-dev] [PATCH 4/6] i965/fs: Implement SIMD16 integer multiplies on Gen 7.
On 09/28/2014 01:26 PM, Matt Turner wrote: > --- > src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 22 -- > 1 file changed, 16 insertions(+), 6 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > index e1f5735..e6c34fa 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > @@ -634,14 +634,24 @@ fs_visitor::visit(ir_expression *ir) > else > emit(MUL(this->result, op[0], op[1])); > } else { > -if (brw->gen >= 7) > - no16("SIMD16 explicit accumulator operands unsupported\n"); > - > struct brw_reg acc = retype(brw_acc_reg(), this->result.type); > > -emit(MUL(acc, op[0], op[1])); > -emit(MACH(reg_null_d, op[0], op[1])); > -emit(MOV(this->result, fs_reg(acc))); > +fs_inst *mul = emit(MUL(acc, op[0], op[1])); > +fs_inst *mach = emit(MACH(reg_null_d, op[0], op[1])); > +fs_inst *mov = emit(MOV(this->result, fs_reg(acc))); > + > +if (brw->gen == 7 && dispatch_width == 16) { > + mul->force_uncompressed = true; > + mach->force_uncompressed = true; > + mov->force_uncompressed = true; > + > + mul = emit(MUL(acc, half(op[0], 1), half(op[1], 1))); > + mul->force_sechalf = true; > + mach = emit(MACH(reg_null_d, half(op[0], 1), half(op[1], 1))); > + mach->force_sechalf = true; > + mov = emit(MOV(half(this->result, 1), fs_reg(acc))); > + mov->force_sechalf = true; > +} Are there a bunch of cases where we double emit things for SIMD16? Would it make more sense to have a generic function that took a list of instructions, duplicated them, and did the force_uncompressed / force_sechalf modification? > } >} else { >emit(MUL(this->result, op[0], op[1])); > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
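The generic helper asked about here could look roughly like the following. This is only a sketch under assumptions: `ToyInst` is a hypothetical stand-in for `fs_inst`, the `half` field stands in for wrapping operands in `half(reg, 1)`, and `emit_both_halves` is an invented name. It mirrors the flag assignments and the emission order used in the patch (whole sequence for the first half, then the whole sequence again for the second half).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for fs_inst; only the fields discussed in the thread.
struct ToyInst {
    std::string opcode;
    int half;                 // which 8-wide half of the operands is used
    bool force_uncompressed;
    bool force_sechalf;
};

// Hypothetical helper: given a sequence written for one 8-wide half,
// return the first-half sequence followed by a second-half copy.
// The sequence is repeated rather than interleaved, matching the order
// in which the patch emits MUL/MACH/MOV twice.
std::vector<ToyInst> emit_both_halves(const std::vector<ToyInst> &seq)
{
    std::vector<ToyInst> out;
    out.reserve(seq.size() * 2);

    for (ToyInst inst : seq) {          // first half
        inst.half = 0;
        inst.force_uncompressed = true;
        inst.force_sechalf = false;
        out.push_back(inst);
    }
    for (ToyInst inst : seq) {          // second half
        inst.half = 1;
        inst.force_uncompressed = false;
        inst.force_sechalf = true;
        out.push_back(inst);
    }

    assert(out.size() == seq.size() * 2);
    return out;
}
```

Passing a three-entry sequence (mul, mach, mov) yields six instructions, the last three marked `force_sechalf`.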
Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2
Brian,

Your patch looks good AFAICT.

Not sure why the crash, and I'm afraid I won't have time to look into it.

I think it might help to '#define USE_MCJIT 1' for now, i.e., enable MCJIT for all LLVM versions. We were avoiding it on old LLVM versions, but AFAICT there's no longer any reason to avoid it now, and it might simplify getting things working again.

If things still don't work, then I think we should revert the recent LLVM changes, move them into a branch so we can investigate the issues with old LLVM more carefully without blocking builds/tests on master.

Jose

From: Brian Paul
Sent: 30 September 2014 21:47
To: Mathias Fröhlich; mesa-dev@lists.freedesktop.org
Cc: Jose Fonseca
Subject: Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2

On 09/30/2014 02:40 PM, Mathias Fröhlich wrote:
>
> Hi Brian,
>
> On Tuesday, September 30, 2014 13:30:21 Brian Paul wrote:
>> Move the USE_MCJIT / HAVE_AVX determination logic to lp_bld.h. If we
>> don't have MCJIT define a dummy LLVMMCJITMemoryManagerRef type to avoid
>> excessive #ifdef testing elsewhere.
> [...]
>> @@ -219,7 +188,7 @@ gallivm_free_code(struct gallivm_state *gallivm)
>> assert(!gallivm->engine);
>> lp_free_generated_code(gallivm->code);
>> gallivm->code = NULL;
>> -#if HAVE_LLVM < 0x0306
>> +#if USE_MCJIT
> We will probably still need the < 0x0306 check:
> #if HAVE_LLVM < 0x0306 && USE_MCJIT
> since this memorymanager stuff just vanished in the way 3.5 implemented
> this with version 3.6.
>
>> LLVMDisposeMCJITMemoryManager(gallivm->memorymgr);
>> gallivm->memorymgr = NULL;
>> #endif
>>
>
> Also, we will probably fail to compile the LLVMDisposeMCJITMemoryManager
> call under some configurations with MCJIT and older llvm.
> So, additionally to what you had, how about the attached one?
> I am still trying to verify this change against 3.5 and 3.6.
> I am not sure about 3.2 since it did not build out of the box with
> my configure line.
It compiles, but I get a segfault when I try to run anything: Program received signal SIGSEGV, Segmentation fault. 0x7797a579 in DelegatingJITMemoryManager::setMemoryWritable (this=0x6f5cb0) at gallivm/lp_bld_misc.cpp:165 165 mgr()->setMemoryWritable(); (gdb) where #0 0x7797a579 in DelegatingJITMemoryManager::setMemoryWritable (this=0x6f5cb0) at gallivm/lp_bld_misc.cpp:165 #1 0x751579cc in (anonymous namespace)::JITEmitter::startFunction (this=0x727b20, F=...) at /build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JITEmitter.cpp:782 #2 0x7567e595 in (anonymous namespace)::Emitter::runOnMachineFunction (this=0x75dd50, MF=...) at /build/buildd/llvm-3.2-3.2/lib/Target/X86/X86CodeEmitter.cpp:145 #3 0x7501bbbf in runOnFunction (F=..., this=0x727980) at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1498 #4 llvm::FPPassManager::runOnFunction (this=0x727980, F=...) at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1476 #5 0x7501f2bb in llvm::FunctionPassManagerImpl::run (this=0x6fc950, F=...) at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1449 #6 0x7501f396 in llvm::FunctionPassManager::run (this=0x6f5d20, F=...) at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1379 #7 0x7514e637 in llvm::JIT::jitTheFunction (this=this@entry=0x6fc800, F=F@entry=0x769720, locked=...) at /build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JIT.cpp:645 #8 0x7514ec2f in llvm::JIT::runJITOnFunctionUnlocked (this=this@entry=0x6fc800, F=F@entry=0x769720, locked=...) 
at /build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JIT.cpp:624 #9 0x7514ed89 in llvm::JIT::getPointerToFunction (this=0x6fc800, F=0x769720) at /build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JIT.cpp:681 #10 0x77941c04 in gallivm_jit_function (gallivm=0x6f5a90, func=0x769720) at gallivm/lp_bld_init.c:586 #11 0x779b54d2 in generate_variant (lp=0x61f750, shader=0x6fdd10, key=0x7fffd9a0) at lp_state_fs.c:2634 #12 0x779b6a77 in llvmpipe_update_fs (lp=0x61f750) at lp_state_fs.c:3166 #13 0x779ac7bb in llvmpipe_update_derived (llvmpipe=0x61f750) at lp_state_derived.c:186 #14 0x77984562 in llvmpipe_draw_vbo (pipe=0x61f750, info=0x7fffdcc0) at lp_draw_arrays.c:70 #15 0x7785c1d3 in cso_draw_vbo (cso=0x6b9260, info=0x7fffdcc0) at cso_cache/cso_context.c:1418 #16 0x7771a373 in st_draw_vbo (ctx=0x77ec4010, prims=0x6ab7c0, nr_prims=2, ib=0x0, index_bounds_valid=1 '\001', min_index=0, max_index=161, tfb_vertcount=0x0, indirect=0x0) at ../../src/mesa/state_tracker/st_draw.c:285 #17 0x776f7e3f in vbo_save_playback_vertex_list (ctx=0x77ec4010, data=0x6ab3ec) at ../../src/mesa/vbo/vbo_save_draw.c:310 #18 0x77524e18 in ext_opcode_execute (ctx=0x77ec4010, node=0x6ab3e8) at ../../src/mesa/main/dlist.c:658 #19 0x7753b5db in execute_list (ctx=0x77ec4010, list=1) at ../../src/mesa/main/dli
Re: [Mesa-dev] [PATCH 0/6] i965/fs: ARB_gpu_shader5 operations SIMD16 support
The first 3 are

Reviewed-by: Ian Romanick

I sent a question on patch 4 that may affect it and the others.

On 09/28/2014 01:26 PM, Matt Turner wrote:
> [PATCH 1/6] i965/fs: Set MUL source type to W/UW in 64-bit mul macro
>
>    Fixes 64-bit multiplies on Gen8.
>
> [PATCH 2/6] i965/fs: Don't offset uniform registers in half().
>
>    Bug fix necessary for later patches.
>
> [PATCH 3/6] i965/fs: Allow SIMD16 borrow/carry/64-bit multiply on Gen
>
>    Don't apply Gen7 restrictions to Gen8.
>
> [PATCH 4/6] i965/fs: Implement SIMD16 integer multiplies on Gen 7.
> [PATCH 5/6] i965/fs: Implement SIMD16 64-bit integer multiplies on Gen 7.
> [PATCH 6/6] i965/fs: Implement SIMD16 carry/borrow on Gen 7.
>
>    Implements SIMD16 operations on Gen7.
Re: [Mesa-dev] [PATCH 1/3] i965/vec4: Call opt_algebraic after opt_cse.
On 09/30/2014 02:16 PM, Matt Turner wrote: > On Tue, Sep 30, 2014 at 2:10 PM, Ian Romanick wrote: >> On 09/27/2014 12:12 PM, Matt Turner wrote: >>> The next patch adds an algebraic optimization for the pattern >>> >>>sqrt a, b >>>rcp c, a >>> >>> and turns it into >>> >>>sqrt a, b >>>rsq c, b >>> >>> but many vertex shaders do >>> >>>a = sqrt(b); >>>var1 /= a; >>>var2 /= a; >>> >>> which generates >>> >>>sqrt a, b >>>rcp c, a >>>rcp d, a >>> >>> If we apply the algebraic optimization before CSE, we'll end up with >>> >>>sqrt a, b >>>rsq c, b >>>rcp d, a >> >> Why doesn't a second pass through opt_algebraic turn this into > > Because the addition in patch #2 just recognizes a consecutive sqrt+rcp > pattern. That makes sense. Series is Reviewed-by: Ian Romanick >>rsq c, b >>rsq d, b >> >> Seems like this could cause us to miss other optimization opportunities... > > This seems pretty sufficient for the collection of shaders in > shader-db -- no regressions, cuts vec4 instructions, and handles 410 > sqrt+rcp pairs. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] i965/fs: Extend predicated break pass to predicate WHILE.
On 09/25/2014 09:00 AM, Matt Turner wrote:
> On Thu, Sep 25, 2014 at 8:25 AM, Ian Romanick wrote:
>> How did you test this? Do we have piglit execution tests that actually
>> hit this path? I'm assuming you didn't play Serious Sam 3 looking for
>> rendering errors. ;)
>
> I wrote the patch and initially missed the necessary predicate_inverse
> bit, and saw a bunch of piglit failures (hangs I think?). I checked
> the docs again and realized I needed to flip it, and then all of
> piglit passed.

Heh... okay.

Reviewed-by: Ian Romanick
Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2
Hi,

On Tuesday, September 30, 2014 21:25:54 Jose Fonseca wrote:
> Your patch looks good AFAICT.
>
> Not sure why the crash, and I'm afraid I won't have time to look into it.

I am currently looking into that.

> I think it might help to '#define USE_MCJIT 1' for now, i.e., enable MCJIT for
> all LLVM versions. We were avoiding it on old LLVM versions, but AFAICT
> there's no longer any reason to avoid it now, and it might simplify getting
> things working again.
>
> If things still don't work, then I think we should revert the recent LLVM
> changes, move them into a branch so we can investigate the issues with old
> LLVM more carefully without blocking builds/tests on master.

Reverting

commit d90ff351f3a3598834f77b9c0723532b3abd3cd5
Author: Mathias Fröhlich
Date: Sun Jul 13 12:49:41 2014 +0200

    llvmpipe: Make a llvmpipe OpenGL context thread safe.

is probably enough. I will just do so if I can't find the reason for the crash now.

Greetings

Mathias
[Mesa-dev] [PATCH] util: add u_lowering
From: Rob Clark TGSI->TGSI pass, extracted from freedreno. Currently provides the following lower support, to help drivers emulate unsupported opcodes or features: Individual opcodes: DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH, DP2, DP2A Also supported, although it is up to the driver to manage it's own shader variants: + two-sided-color + texture coord saturate (ie. to emulate GL_CLAMP) All of the lowering operations are opt-in so a driver can pick and choose what it wants. Signed-off-by: Rob Clark --- src/gallium/auxiliary/Makefile.sources |1 + src/gallium/auxiliary/util/u_lowering.c | 1571 +++ src/gallium/auxiliary/util/u_lowering.h | 87 ++ 3 files changed, 1659 insertions(+) create mode 100644 src/gallium/auxiliary/util/u_lowering.c create mode 100644 src/gallium/auxiliary/util/u_lowering.h diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index 58d8af7..575c315 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -125,6 +125,7 @@ C_SOURCES := \ util/u_keymap.c \ util/u_linear.c \ util/u_linkage.c \ + util/u_lowering.c \ util/u_network.c \ util/u_math.c \ util/u_mm.c \ diff --git a/src/gallium/auxiliary/util/u_lowering.c b/src/gallium/auxiliary/util/u_lowering.c new file mode 100644 index 000..fd193bc --- /dev/null +++ b/src/gallium/auxiliary/util/u_lowering.c @@ -0,0 +1,1571 @@ +/* + * Copyright (C) 2014 Red Hat + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * 
paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Authors: + *Rob Clark + */ + +#include "tgsi/tgsi_transform.h" +#include "tgsi/tgsi_scan.h" +#include "tgsi/tgsi_dump.h" + +#include "util/u_debug.h" +#include "util/u_math.h" +#include "util/u_lowering.h" + +struct u_lowering_context { + struct tgsi_transform_context base; + const struct u_lowering_config *config; + struct tgsi_shader_info *info; + unsigned two_side_colors; + unsigned two_side_idx[PIPE_MAX_SHADER_INPUTS]; + unsigned color_base; /* base register for chosen COLOR/BCOLOR's */ + int face_idx; + unsigned numtmp; + struct { + struct tgsi_full_src_register src; + struct tgsi_full_dst_register dst; + } tmp[2]; +#define A 0 +#define B 1 + struct tgsi_full_src_register imm; + int emitted_decls; + unsigned saturate; +}; + +static inline struct u_lowering_context * +u_lowering_context(struct tgsi_transform_context *tctx) +{ + return (struct u_lowering_context *) tctx; +} + +/* + * Utility helpers: + */ + +static void +reg_dst(struct tgsi_full_dst_register *dst, +const struct tgsi_full_dst_register *orig_dst, unsigned wrmask) +{ + *dst = *orig_dst; + dst->Register.WriteMask &= wrmask; + assert(dst->Register.WriteMask); +} + +static inline void +get_swiz(unsigned *swiz, const struct tgsi_src_register *src) +{ + swiz[0] = src->SwizzleX; + swiz[1] = src->SwizzleY; + swiz[2] = src->SwizzleZ; + swiz[3] = src->SwizzleW; +} + +static void +reg_src(struct tgsi_full_src_register *src, +const 
struct tgsi_full_src_register *orig_src, +unsigned sx, unsigned sy, unsigned sz, unsigned sw) +{ + unsigned swiz[4]; + get_swiz(swiz, &orig_src->Register); + *src = *orig_src; + src->Register.SwizzleX = swiz[sx]; + src->Register.SwizzleY = swiz[sy]; + src->Register.SwizzleZ = swiz[sz]; + src->Register.SwizzleW = swiz[sw]; +} + +#define TGSI_SWIZZLE__ TGSI_SWIZZLE_X /* don't-care value! */ +#define SWIZ(x,y,z,w) TGSI_SWIZZLE_ ## x, TGSI_SWIZZLE_ ## y, \ + TGSI_SWIZZLE_ ## z, TGSI_SWIZZLE_ ## w + +/* + * if (dst.x aliases src.x) { + * MOV tmpA.x, src.x + * src = tmpA + * } + * COS dst.x, src.x + * SIN dst.y, src.x + * MOV dst.zw, imm{0.0, 1.0} + */ +static bool +aliases(const struct tgsi_full_dst_register *dst, unsigned dst_mask, +const struct tgsi_
Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2
On 09/30/2014 03:34 PM, Mathias Fröhlich wrote:
> Hi,
>
> On Tuesday, September 30, 2014 21:25:54 Jose Fonseca wrote:
>> Your patch looks good AFAICT.
>>
>> Not sure why the crash, and I'm afraid I won't have time to look into it.
>
> I am currently looking into that.
>
>> I think it might help to '#define USE_MCJIT 1' for now, i.e., enable MCJIT for
>> all LLVM versions. We were avoiding it on old LLVM versions, but AFAICT
>> there's no longer any reason to avoid it now, and it might simplify getting
>> things working again.
>>
>> If things still don't work, then I think we should revert the recent LLVM
>> changes, move them into a branch so we can investigate the issues with old
>> LLVM more carefully without blocking builds/tests on master.
>
> Reverting
>
> commit d90ff351f3a3598834f77b9c0723532b3abd3cd5
> Author: Mathias Fröhlich
> Date: Sun Jul 13 12:49:41 2014 +0200
>
>     llvmpipe: Make a llvmpipe OpenGL context thread safe.
>
> is probably enough. I will just do so if I can't find the reason for the crash now.

Yeah, reverting that patch clears up the regression here. Please go ahead and do the revert if you don't think you can solve the problem otherwise. Thanks!

-Brian
Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2
Hi Brian, On Tuesday, September 30, 2014 15:42:21 Brian Paul wrote: > Yeah, reverting that patch clears up the regression here. Please go > ahead and do the revert if you don't think you can solve the problem > otherwise. Thanks! I could even reproduce the segfault with the previous patch and 3.5, which is the one I am currently testing against. Does the attached patch - based on master - also fix your problems on 3.3? ... sorry, I am still iterating on a configuration that builds 3.2 or at least 3.3. Thanks for your patience! Mathias>From 321aecdcebd9844568985e1064d6679e04cf6e2a Mon Sep 17 00:00:00 2001 Message-Id: <321aecdcebd9844568985e1064d6679e04cf6e2a.1412113717.git.mathias.froehl...@gmx.net> From: =?UTF-8?q?Mathias=20Fr=C3=B6hlich?= Date: Tue, 30 Sep 2014 22:11:30 +0200 Subject: [PATCH] gallivm: fix build for LLVM 3.2 --- src/gallium/auxiliary/gallivm/lp_bld.h| 8 src/gallium/auxiliary/gallivm/lp_bld_init.c | 4 +--- src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 9 + src/gallium/auxiliary/gallivm/lp_bld_misc.h | 3 +++ 4 files changed, 21 insertions(+), 3 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld.h b/src/gallium/auxiliary/gallivm/lp_bld.h index fcf4f16..a01c216 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld.h +++ b/src/gallium/auxiliary/gallivm/lp_bld.h @@ -58,6 +58,14 @@ #endif +#if HAVE_LLVM <= 0x0303 +/* We won't actually use LLVMMCJITMemoryManagerRef, just create a dummy + * typedef to simplify things elsewhere. + */ +typedef void *LLVMMCJITMemoryManagerRef; +#endif + + /** * Redefine these LLVM entrypoints as invalid macros to make sure we * don't accidentally use them. 
We need to use the functions which diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c index 4e4aecb..8d7a0b6 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_init.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c @@ -219,10 +219,8 @@ gallivm_free_code(struct gallivm_state *gallivm) assert(!gallivm->engine); lp_free_generated_code(gallivm->code); gallivm->code = NULL; -#if HAVE_LLVM < 0x0306 - LLVMDisposeMCJITMemoryManager(gallivm->memorymgr); + lp_free_memory_manager(gallivm->memorymgr); gallivm->memorymgr = NULL; -#endif } diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp index c173ab6..9c2de2c 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp @@ -554,3 +554,12 @@ lp_get_default_memory_manager() return 0; #endif } + +extern "C" +void +lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr) +{ +#if HAVE_LLVM < 0x0306 + delete reinterpret_cast(memorymgr); +#endif +} diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.h b/src/gallium/auxiliary/gallivm/lp_bld_misc.h index 40d3e79..36923aa 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.h @@ -65,6 +65,9 @@ lp_free_generated_code(struct lp_generated_code *code); extern LLVMMCJITMemoryManagerRef lp_get_default_memory_manager(); +extern void +lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr); + #ifdef __cplusplus } #endif -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: add u_lowering
On 09/30/2014 03:38 PM, Rob Clark wrote:
> From: Rob Clark
>
> TGSI->TGSI pass, extracted from freedreno. Currently provides the
> following lower support, to help drivers emulate unsupported opcodes
> or features:
>
> Individual opcodes:
>    DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH,
>    DP2, DP2A
>
> Also supported, although it is up to the driver to manage its own
> shader variants:
>  + two-sided-color
>  + texture coord saturate (ie. to emulate GL_CLAMP)
>
> All of the lowering operations are opt-in so a driver can pick and
> choose what it wants.
>
> Signed-off-by: Rob Clark

Hi Rob, a few thoughts:

How about moving this into the src/gallium/auxiliary/tgsi/ directory since this is a very TGSI-specific thing?

I think some of my recent changes to the tgsi_transform code would be helpful, like emit_epilog/prolog() and the various tgsi_transform_opX_inst() helpers.

There's a lot of tricky code trying to determine the exact size of the new shader token buffer. Why not just use a 2x buffer and then realloc to the exact size at the end?

I think the big if/switch statements could be replaced by some kind of table-driven system.

FWIW, I have a two-sided lighting transformation function too that I haven't pushed to master yet.

-Brian
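The "2x buffer, then realloc to the exact size" idea can be sketched as follows. This is an illustration only, not the tgsi_transform API: `TokenBuf` and the `tokbuf_*` functions are hypothetical names, a plain `unsigned` stands in for `struct tgsi_token`, and growing the buffer on overflow is an added assumption for the case where a lowering expands the shader by more than 2x.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical token buffer; "unsigned" stands in for struct tgsi_token.
struct TokenBuf {
    unsigned *tokens;
    std::size_t count;     // tokens written so far
    std::size_t capacity;  // tokens allocated
};

// Start with twice the original token count instead of computing the
// exact size of the lowered shader up front.
bool tokbuf_init(TokenBuf *buf, std::size_t orig_count)
{
    buf->capacity = orig_count ? orig_count * 2 : 2;
    buf->count = 0;
    buf->tokens = static_cast<unsigned *>(
        std::malloc(buf->capacity * sizeof(unsigned)));
    return buf->tokens != nullptr;
}

// Append one token, doubling the allocation if the 2x guess was too small.
bool tokbuf_append(TokenBuf *buf, unsigned tok)
{
    if (buf->count == buf->capacity) {
        std::size_t new_cap = buf->capacity * 2;
        void *p = std::realloc(buf->tokens, new_cap * sizeof(unsigned));
        if (!p)
            return false;  // the old allocation is still valid
        buf->tokens = static_cast<unsigned *>(p);
        buf->capacity = new_cap;
    }
    buf->tokens[buf->count++] = tok;
    return true;
}

// Shrink the allocation to exactly the number of tokens written.
void tokbuf_finish(TokenBuf *buf)
{
    if (buf->count == 0)
        return;  // avoid realloc(ptr, 0); keep the buffer as it is
    void *p = std::realloc(buf->tokens, buf->count * sizeof(unsigned));
    if (p) {
        buf->tokens = static_cast<unsigned *>(p);
        buf->capacity = buf->count;
    }
}
```

A caller would run `tokbuf_init`, append tokens while transforming, call `tokbuf_finish`, and eventually release `tokens` with `free`.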
Re: [Mesa-dev] [PATCH 4/6] i965/fs: Implement SIMD16 integer multiplies on Gen 7.
On Tue, Sep 30, 2014 at 2:26 PM, Ian Romanick wrote:
> Are there a bunch of cases where we double emit things for SIMD16?
> Would it make more sense to have a generic function that took a list of
> instructions, duplicated them, and did the force_uncompressed /
> force_sechalf modification?

Not many. Other than these, the only other things are the 3-src instructions on SNB+IVB, and BFI instructions on Haswell. In those cases, we can just double emit instructions in the generator.

These (addc, subb, integer multiplies) are weird and have to be handled in the visitor because they use the accumulator and on Gen7 the accumulator doesn't handle integer data in SIMD16.

I'm going to have to rebase the last three on Jason's changes though, so I'll resend them.
Re: [Mesa-dev] [PATCH] util: add u_lowering
Rob Clark writes:

> From: Rob Clark
>
> TGSI->TGSI pass, extracted from freedreno. Currently provides the
> following lower support, to help drivers emulate unsupported opcodes
> or features:
>
> Individual opcodes:
>    DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH,
>    DP2, DP2A
>
> Also supported, although it is up to the driver to manage its own
> shader variants:
>  + two-sided-color
>  + texture coord saturate (ie. to emulate GL_CLAMP)
>
> All of the lowering operations are opt-in so a driver can pick and
> choose what it wants.

This is very useful to me, as it got me +15 piglit tests, and -70 lines of code. I'm not using FRC because I can do it in just as many instructions as my FLR, and I'm not using LRP because I'm doing the (c + a * (b - c)) version of things, but I'm using all the rest of those opcodes.

I'm not cloning the tokens for shader variants yet, so I'm still doing my own texcoord clamping. However, I do want to use the two-side-color lowering, so I'll probably start cloning for variants, at which point I'll use the texcoord bits, too.

However, I'd like to see the code first have a commit that's just a raw copy, so that git log --follow can track where the code came from. I've pushed a branch called "robclark-lowering" to my mesa tree if maybe you want to ack that starting commit?
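For reference, the (c + a * (b - c)) form mentioned for LRP is the same value as the usual interpolation expression, assuming the standard TGSI definition of LRP as src0 * src1 + (1 - src0) * src2 with a, b, c standing for the three sources:

```latex
\mathrm{LRP}(a, b, c) = a\,b + (1 - a)\,c = a\,b + c - a\,c = c + a\,(b - c)
```

The rewritten form needs one subtract and one multiply-add, which is presumably why a driver with a MAD instruction would prefer it; that motivation is an inference, not something stated in the thread.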
Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2
Brian, at least here, I get a build that runs glxgears which previously did not run with 3.3, 3.5. Currently the compile test runs with 3.6. If this succeeds, ok to push the attached fix (The same than before but with a more descriptive commit message)? Greetings Mathias>From 39a8625423f85327eefdadd3d4068c9d3e26d936 Mon Sep 17 00:00:00 2001 Message-Id: <39a8625423f85327eefdadd3d4068c9d3e26d936.1412114843.git.mathias.froehl...@gmx.net> From: =?UTF-8?q?Mathias=20Fr=C3=B6hlich?= Date: Tue, 30 Sep 2014 22:11:30 +0200 Subject: [PATCH] gallivm: Fix build for LLVM 3.2 Do not rely on LLVMMCJITMemoryManagerRef being available. The c binding to the memory manager objects only appeared on llvm-3.4. The change is based on an initial patch of Brian Paul. Signed-off-by: Mathias Froehlich --- src/gallium/auxiliary/gallivm/lp_bld.h| 8 src/gallium/auxiliary/gallivm/lp_bld_init.c | 4 +--- src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 9 + src/gallium/auxiliary/gallivm/lp_bld_misc.h | 3 +++ 4 files changed, 21 insertions(+), 3 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld.h b/src/gallium/auxiliary/gallivm/lp_bld.h index fcf4f16..a01c216 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld.h +++ b/src/gallium/auxiliary/gallivm/lp_bld.h @@ -58,6 +58,14 @@ #endif +#if HAVE_LLVM <= 0x0303 +/* We won't actually use LLVMMCJITMemoryManagerRef, just create a dummy + * typedef to simplify things elsewhere. + */ +typedef void *LLVMMCJITMemoryManagerRef; +#endif + + /** * Redefine these LLVM entrypoints as invalid macros to make sure we * don't accidentally use them. 
We need to use the functions which diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c index 4e4aecb..8d7a0b6 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_init.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c @@ -219,10 +219,8 @@ gallivm_free_code(struct gallivm_state *gallivm) assert(!gallivm->engine); lp_free_generated_code(gallivm->code); gallivm->code = NULL; -#if HAVE_LLVM < 0x0306 - LLVMDisposeMCJITMemoryManager(gallivm->memorymgr); + lp_free_memory_manager(gallivm->memorymgr); gallivm->memorymgr = NULL; -#endif } diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp index c173ab6..9c2de2c 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp @@ -554,3 +554,12 @@ lp_get_default_memory_manager() return 0; #endif } + +extern "C" +void +lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr) +{ +#if HAVE_LLVM < 0x0306 + delete reinterpret_cast(memorymgr); +#endif +} diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.h b/src/gallium/auxiliary/gallivm/lp_bld_misc.h index 40d3e79..36923aa 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.h @@ -65,6 +65,9 @@ lp_free_generated_code(struct lp_generated_code *code); extern LLVMMCJITMemoryManagerRef lp_get_default_memory_manager(); +extern void +lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr); + #ifdef __cplusplus } #endif -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
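The approach in this patch, freeing the memory manager through a small C-callable wrapper that lives on the C++ side rather than through the LLVM C binding, follows a common opaque-handle pattern. The sketch below illustrates that pattern only; `ToyMemoryManager`, `ToyMemoryManagerRef` and the `toy_*` functions are hypothetical names, not gallivm or LLVM API, and the `HAVE_LLVM` version guard from the patch is not modeled.

```cpp
#include <cassert>

// Opaque handle type, as a C caller would see it (hypothetical name).
typedef struct ToyOpaqueMemoryManager *ToyMemoryManagerRef;

// Hypothetical C++ class hidden behind the handle.
class ToyMemoryManager {
public:
    static int live;               // number of live instances, for the demo
    ToyMemoryManager() { ++live; }
    ~ToyMemoryManager() { --live; }
};
int ToyMemoryManager::live = 0;

// C-callable constructor: hands out the object as an opaque pointer.
extern "C" ToyMemoryManagerRef
toy_get_memory_manager()
{
    return reinterpret_cast<ToyMemoryManagerRef>(new ToyMemoryManager());
}

// C-callable destructor: casts the handle back and deletes the object.
extern "C" void
toy_free_memory_manager(ToyMemoryManagerRef mgr)
{
    // delete on a null pointer is a no-op, so callers need no extra check.
    delete reinterpret_cast<ToyMemoryManager *>(mgr);
}
```

Because the C code only ever sees the handle type, the C side does not need any LLVM header that declares the underlying class.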
Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2
On 09/30/2014 04:17 PM, Mathias Fröhlich wrote:
> Brian,
>
> at least here, I get a build that runs glxgears which
> previously did not run with 3.3, 3.5.
> Currently the compile test runs with 3.6.
> If this succeeds, ok to push the attached fix
> (the same as before but with a more descriptive commit message)?

OK, this patch seems to fix everything here. Thanks!

Reviewed-by: Brian Paul
Tested-by: Brian Paul
Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2
On 01.10.2014 00:17, Mathias Fröhlich wrote:
>
> Brian,
>
> at least here, I get a build that runs glxgears which
> previously did not run with 3.3, 3.5.
> Currently the compile test runs with 3.6.
> If this succeeds, ok to push the attached fix
> (The same as before but with a more descriptive commit message)?
>
> Greetings
> Mathias
>

Looks good to me too.

Reviewed-by: Roland Scheidegger
Re: [Mesa-dev] [PATCH] util: add u_lowering
Brian Paul writes:

> On 09/30/2014 03:38 PM, Rob Clark wrote:
>> From: Rob Clark
>>
>> TGSI->TGSI pass, extracted from freedreno.  Currently provides the
>> following lowering support, to help drivers emulate unsupported opcodes
>> or features:
>>
>> Individual opcodes:
>>    DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH,
>>    DP2, DP2A
>>
>> Also supported, although it is up to the driver to manage its own
>> shader variants:
>>  + two-sided-color
>>  + texture coord saturate (ie. to emulate GL_CLAMP)
>>
>> All of the lowering operations are opt-in so a driver can pick and
>> choose what it wants.
>>
>> Signed-off-by: Rob Clark
>
> Hi Rob, a few thoughts:
>
> How about moving this into the src/gallium/auxiliary/tgsi/ directory
> since this is a very TGSI-specific thing?

I happened to be writing a series in parallel with Rob to do the same
thing, and I chose tgsi/.  I've rebased my changes on his freedreno fix,
and I'm going to send them out in this thread now.  I don't really care
which location wins; having good history is all I care about.  (Also,
reviewing his changes, I found style improvements that I propagated to
mine, so the duplicated work ended up having some value.)

> I think some of my recent changes to the tgsi_transform code would be
> helpful, like emit_epilog/prolog() and the various
> tgsi_transform_opX_inst() helpers.
>
> There's a lot of tricky code trying to determine the exact size of the
> new shader token buffer.  Why not just use a 2x buffer and then realloc
> to the exact size at the end?

2x isn't nearly the lowest growth factor, right?  How massively
overallocated do you go?  All this calculation of array sizes up front
is pretty awful, though -- it seems like token buffers ought to be just
growing as you append tokens.  But I guess that would be a change to do
to tgsi_transform in general.

> I think the big if/switch statements could be replaced by some kind of
> table-driven system.
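For reference, the "grow as you append, then trim to the exact size" idea discussed above might look roughly like the sketch below. It is only an illustration under stated assumptions: `struct token_buf`, `token_buf_append()` and `token_buf_shrink_to_fit()` are hypothetical names and are not part of tgsi_transform, and the initial capacity of 16 and the doubling factor are arbitrary choices.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable token buffer; not the tgsi_transform API. */
struct token_buf {
   unsigned *tokens;
   size_t count;      /* tokens currently stored */
   size_t capacity;   /* tokens that fit before the next realloc */
};

/* Append one token, doubling the capacity when the buffer is full.
 * Returns 1 on success, 0 on allocation failure (buffer left intact). */
int
token_buf_append(struct token_buf *buf, unsigned token)
{
   if (buf->count == buf->capacity) {
      size_t new_cap = buf->capacity ? buf->capacity * 2 : 16;
      unsigned *p = realloc(buf->tokens, new_cap * sizeof(*p));
      if (!p)
         return 0;
      buf->tokens = p;
      buf->capacity = new_cap;
   }
   buf->tokens[buf->count++] = token;
   return 1;
}

/* Trim the allocation to exactly the number of tokens stored.
 * Returns 1 on success, 0 if realloc failed (old buffer still valid). */
int
token_buf_shrink_to_fit(struct token_buf *buf)
{
   unsigned *p;

   if (buf->count == 0) {
      free(buf->tokens);
      buf->tokens = NULL;
      buf->capacity = 0;
      return 1;
   }

   p = realloc(buf->tokens, buf->count * sizeof(*p));
   if (!p)
      return 0;
   buf->tokens = p;
   buf->capacity = buf->count;
   return 1;
}
```

With doubling, the transient overallocation stays below 2x of what is actually stored once the buffer has grown past its initial capacity, and no up-front size calculation is needed; the cost is an occasional copy inside `realloc`.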
[Mesa-dev] [PATCH 1/3] gallium: Copy fd_lowering.[ch] to tgsi_lowering.[ch] for code sharing.
Lots of drivers need to transform the weird instructions in TGSI into
reasonable scalar ops, and this code can make those translations
canonical.
---
 src/gallium/auxiliary/tgsi/tgsi_lowering.c | 1573
 src/gallium/auxiliary/tgsi/tgsi_lowering.h | 89 ++
 2 files changed, 1662 insertions(+)
 create mode 100644 src/gallium/auxiliary/tgsi/tgsi_lowering.c
 create mode 100644 src/gallium/auxiliary/tgsi/tgsi_lowering.h

diff --git a/src/gallium/auxiliary/tgsi/tgsi_lowering.c b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
new file mode 100644
index 000..795b537
--- /dev/null
+++ b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
@@ -0,0 +1,1573 @@
+/* -*- mode: C; c-file-style: "k&r"; tab-width 4; indent-tabs-mode: t; -*- */
+
+/*
+ * Copyright (C) 2014 Rob Clark
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ * Authors:
+ *    Rob Clark
+ */
+
+#include "tgsi/tgsi_transform.h"
+#include "tgsi/tgsi_scan.h"
+#include "tgsi/tgsi_dump.h"
+
+#include "util/u_debug.h"
+#include "util/u_math.h"
+
+#include "freedreno_lowering.h"
+
+struct fd_lowering_context {
+	struct tgsi_transform_context base;
+	const struct fd_lowering_config *config;
+	struct tgsi_shader_info *info;
+	unsigned two_side_colors;
+	unsigned two_side_idx[PIPE_MAX_SHADER_INPUTS];
+	unsigned color_base;  /* base register for chosen COLOR/BCOLOR's */
+	int face_idx;
+	unsigned numtmp;
+	struct {
+		struct tgsi_full_src_register src;
+		struct tgsi_full_dst_register dst;
+	} tmp[2];
+#define A 0
+#define B 1
+	struct tgsi_full_src_register imm;
+	int emitted_decls;
+	unsigned saturate;
+};
+
+static inline struct fd_lowering_context *
+fd_lowering_context(struct tgsi_transform_context *tctx)
+{
+	return (struct fd_lowering_context *)tctx;
+}
+
+/*
+ * Utility helpers:
+ */
+
+static void
+reg_dst(struct tgsi_full_dst_register *dst,
+	const struct tgsi_full_dst_register *orig_dst, unsigned wrmask)
+{
+	*dst = *orig_dst;
+	dst->Register.WriteMask &= wrmask;
+	assert(dst->Register.WriteMask);
+}
+
+static inline void
+get_swiz(unsigned *swiz, const struct tgsi_src_register *src)
+{
+	swiz[0] = src->SwizzleX;
+	swiz[1] = src->SwizzleY;
+	swiz[2] = src->SwizzleZ;
+	swiz[3] = src->SwizzleW;
+}
+
+static void
+reg_src(struct tgsi_full_src_register *src,
+	const struct tgsi_full_src_register *orig_src,
+	unsigned sx, unsigned sy, unsigned sz, unsigned sw)
+{
+	unsigned swiz[4];
+	get_swiz(swiz, &orig_src->Register);
+	*src = *orig_src;
+	src->Register.SwizzleX = swiz[sx];
+	src->Register.SwizzleY = swiz[sy];
+	src->Register.SwizzleZ = swiz[sz];
+	src->Register.SwizzleW = swiz[sw];
+}
+
+#define TGSI_SWIZZLE__ TGSI_SWIZZLE_X  /* don't-care value! */
+#define SWIZ(x,y,z,w) TGSI_SWIZZLE_ ## x, TGSI_SWIZZLE_ ## y,   \
+		TGSI_SWIZZLE_ ## z, TGSI_SWIZZLE_ ## w
+
+/*
+ * if (dst.x aliases src.x) {
+ *   MOV tmpA.x, src.x
+ *   src = tmpA
+ * }
+ * COS dst.x, src.x
+ * SIN dst.y, src.x
+ * MOV dst.zw, imm{0.0, 1.0}
+ */
+static bool
+aliases(const struct tgsi_full_dst_register *dst, unsigned dst_mask,
+	const struct tgsi_full_src_register *src, unsigned src_mask)
+{
+	if ((dst->Register.File == src->Register.File) &&
+			(dst->Register.Index == src->Register.Index)) {
+		unsigned i, actual_mask = 0;
+		unsigned swiz[4];
+		get_swiz(swiz, &src->Register);
+		for (i = 0; i < 4; i++)
+			if (src_mask & (1 << i))
+				actual_mask |= (1 << swiz[i]);
+		if (actual_mask & dst_mask)
+			return true;
+	}
+	return false;
+}
+
+static void
+create_mov(struct tgsi_transform_context *tctx,
[Mesa-dev] [PATCH 3/3] gallium: Rename freedreno parts of tgsi_lowering.[ch].
---
 src/gallium/auxiliary/Makefile.sources     |  1 +
 src/gallium/auxiliary/tgsi/tgsi_lowering.c | 50 +++---
 src/gallium/auxiliary/tgsi/tgsi_lowering.h | 12 +++
 3 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources
index 58d8af7..f6621ef 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -75,6 +75,7 @@ C_SOURCES := \
 	tgsi/tgsi_exec.c \
 	tgsi/tgsi_info.c \
 	tgsi/tgsi_iterate.c \
+	tgsi/tgsi_lowering.c \
 	tgsi/tgsi_parse.c \
 	tgsi/tgsi_sanity.c \
 	tgsi/tgsi_scan.c \
diff --git a/src/gallium/auxiliary/tgsi/tgsi_lowering.c b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
index 5627bb5..b6b18db 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_lowering.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
@@ -31,11 +31,11 @@
 #include "util/u_debug.h"
 #include "util/u_math.h"
 
-#include "freedreno_lowering.h"
+#include "tgsi_lowering.h"
 
-struct fd_lowering_context {
+struct tgsi_lowering_context {
 	struct tgsi_transform_context base;
-	const struct fd_lowering_config *config;
+	const struct tgsi_lowering_config *config;
 	struct tgsi_shader_info *info;
 	unsigned two_side_colors;
 	unsigned two_side_idx[PIPE_MAX_SHADER_INPUTS];
@@ -53,10 +53,10 @@ struct fd_lowering_context {
 	unsigned saturate;
 };
 
-static inline struct fd_lowering_context *
-fd_lowering_context(struct tgsi_transform_context *tctx)
+static inline struct tgsi_lowering_context *
+tgsi_lowering_context(struct tgsi_transform_context *tctx)
 {
-	return (struct fd_lowering_context *)tctx;
+	return (struct tgsi_lowering_context *)tctx;
 }
 
 /*
@@ -196,7 +196,7 @@ static void
 transform_dst(struct tgsi_transform_context *tctx,
 		struct tgsi_full_instruction *inst)
 {
-	struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+	struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
 	struct tgsi_full_dst_register *dst = &inst->Dst[0];
 	struct tgsi_full_src_register *src0 = &inst->Src[0];
 	struct tgsi_full_src_register *src1 = &inst->Src[1];
@@ -276,7 +276,7 @@ static void
 transform_xpd(struct tgsi_transform_context *tctx,
 		struct tgsi_full_instruction *inst)
 {
-	struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+	struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
 	struct tgsi_full_dst_register *dst = &inst->Dst[0];
 	struct tgsi_full_src_register *src0 = &inst->Src[0];
 	struct tgsi_full_src_register *src1 = &inst->Src[1];
@@ -347,7 +347,7 @@ static void
 transform_scs(struct tgsi_transform_context *tctx,
 		struct tgsi_full_instruction *inst)
 {
-	struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+	struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
 	struct tgsi_full_dst_register *dst = &inst->Dst[0];
 	struct tgsi_full_src_register *src = &inst->Src[0];
 	struct tgsi_full_instruction new_inst;
@@ -409,7 +409,7 @@ static void
 transform_lrp(struct tgsi_transform_context *tctx,
 		struct tgsi_full_instruction *inst)
 {
-	struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+	struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
 	struct tgsi_full_dst_register *dst = &inst->Dst[0];
 	struct tgsi_full_src_register *src0 = &inst->Src[0];
 	struct tgsi_full_src_register *src1 = &inst->Src[1];
@@ -475,7 +475,7 @@ static void
 transform_frc(struct tgsi_transform_context *tctx,
 		struct tgsi_full_instruction *inst)
 {
-	struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+	struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
 	struct tgsi_full_dst_register *dst = &inst->Dst[0];
 	struct tgsi_full_src_register *src = &inst->Src[0];
 	struct tgsi_full_instruction new_inst;
@@ -519,7 +519,7 @@ static void
 transform_pow(struct tgsi_transform_context *tctx,
 		struct tgsi_full_instruction *inst)
 {
-	struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+	struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
 	struct tgsi_full_dst_register *dst = &inst->Dst[0];
 	struct tgsi_full_src_register *src0 = &inst->Src[0];
 	struct tgsi_full_src_register *src1 = &inst->Src[1];
@@ -579,7 +579,7 @@ static void
 transform_lit(struct tgsi_transform_context *tctx,
 		struct tgsi_full_instruction *inst)
 {
-	struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+	struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
 	struct tgsi_full_dst_register *dst = &inst->Dst[0];
 	struct tgsi_full_src_register *src = &inst->Src[0];
 	struct tgsi_full_instruction new_inst;
@@ -690,7 +690,7 @@ static void
 transform_exp(struct tgsi_transform_context *tctx,
 		struct tgsi_full_instruction *inst)
 {
-	struct fd_lowering_context *ctx = fd_lowe