Re: [Mesa-dev] [PATCH] i965/cs: Use exec all for CS terminate

2015-06-12 Thread Francisco Jerez
Jordan Justen writes: > This prevents an assertion from being hit with SIMD16: > > Assertion `inst->exec_size == dispatch_width() || force_writemask_all' failed. > > Signed-off-by: Jordan Justen > Cc: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_

Re: [Mesa-dev] [PATCH 07/17] i965/fs: Move offset() and half() to the fs_builder

2015-06-23 Thread Francisco Jerez
Jason Ekstrand writes: > We want to move these into the builder so that they know the current > builder's dispatch width. This will be needed by a later commit. I very much like the idea of this series, but, why do you need to move these register manipulators into the builder? The builder is a

Re: [Mesa-dev] [PATCH 07/17] i965/fs: Move offset() and half() to the fs_builder

2015-06-24 Thread Francisco Jerez
Jason Ekstrand writes: > On Tue, Jun 23, 2015 at 9:22 AM, Francisco Jerez > wrote: >> Jason Ekstrand writes: >> >>> We want to move these into the builder so that they know the current >>> builder's dispatch width. This will be needed by a later c

Re: [Mesa-dev] [PATCH v2 23/82] glsl: Do not do CSE for expressions involving SSBO loads

2015-06-24 Thread Francisco Jerez
Iago Toral writes: > On Wed, 2015-06-17 at 17:20 -0700, Jordan Justen wrote: >> I wanted to question whether this was required, based on this text >> from the extension spec: >> >> "The ability to write to buffer objects creates the potential for >> multiple independent shader invocations to re

Re: [Mesa-dev] [PATCH] clover: Implement image attribute getters

2015-06-24 Thread Francisco Jerez
Zoltan Gilian writes: > Image attributes are passed to the kernel as hidden parameters after the > image attribute itself. An llvm pass replaces the getter builtins to > the appropriate parameters. This seems to be doing essentially the same thing as v1? Is it the right patch? > --- > src/gal

Re: [Mesa-dev] [PATCH 07/17] i965/fs: Move offset() and half() to the fs_builder

2015-06-24 Thread Francisco Jerez
Jason Ekstrand writes: > On Jun 24, 2015 4:29 AM, "Francisco Jerez" wrote: >> >> Jason Ekstrand writes: >> >> > On Tue, Jun 23, 2015 at 9:22 AM, Francisco Jerez > wrote: >> >> Jason Ekstrand writes: >> >> >>

Re: [Mesa-dev] [PATCH 07/17] i965/fs: Move offset() and half() to the fs_builder

2015-06-24 Thread Francisco Jerez
Jason Ekstrand writes: > On Jun 24, 2015 6:29 AM, "Francisco Jerez" wrote: >> >> Jason Ekstrand writes: >> >> > On Jun 24, 2015 4:29 AM, "Francisco Jerez" > wrote: >> >> >> >> Jason Ekstrand writes: >> >

Re: [Mesa-dev] [PATCH 07/17] i965/fs: Move offset() and half() to the fs_builder

2015-06-24 Thread Francisco Jerez
Jason Ekstrand writes: > On Wed, Jun 24, 2015 at 6:44 AM, Francisco Jerez > wrote: >> Jason Ekstrand writes: >> >>> On Jun 24, 2015 6:29 AM, "Francisco Jerez" wrote: >>>> >>>> Jason Ekstrand writes: >>>> >>>

Re: [Mesa-dev] [PATCH v2] glsls: Modify exec_list to avoid strict-aliasing violations

2015-06-26 Thread Francisco Jerez
Davin McCall writes: > On 26/06/15 11:08, Erik Faye-Lund wrote: >> On Thu, Jun 25, 2015 at 1:48 AM, Davin McCall wrote: >>> This is an alternative to my earlier patch [1] (and it is now constructed >>> properly using git format-patch). >>> >>> Quick background: >>> There is a problem in exec_lis

Re: [Mesa-dev] [PATCH v2] glsls: Modify exec_list to avoid strict-aliasing violations

2015-06-26 Thread Francisco Jerez
Davin McCall writes: > On 26/06/15 13:18, Francisco Jerez wrote: >> Davin McCall writes: >> >>> On 26/06/15 11:08, Erik Faye-Lund wrote: >>>> On Thu, Jun 25, 2015 at 1:48 AM, Davin McCall wrote: >>>>> This is an alternative to my earlier

Re: [Mesa-dev] [PATCH v2] glsls: Modify exec_list to avoid strict-aliasing violations

2015-06-26 Thread Francisco Jerez
Davin McCall writes: > On 26/06/15 14:31, Eirik Byrkjeflot Anonsen wrote: >> Erik Faye-Lund writes: >> >>> On Fri, Jun 26, 2015 at 1:23 PM, Davin McCall wrote: On 26/06/15 12:03, Davin McCall wrote: > ... The stored value of 'n' is not accessed by any other type than the > type of

Re: [Mesa-dev] [PATCH v2] glsls: Modify exec_list to avoid strict-aliasing violations

2015-06-26 Thread Francisco Jerez
Erik Faye-Lund writes: > On Fri, Jun 26, 2015 at 4:16 PM, Davin McCall wrote: >> On 26/06/15 14:53, Erik Faye-Lund wrote: >>> >>> On Fri, Jun 26, 2015 at 3:05 PM, Davin McCall wrote: On 26/06/15 12:55, Erik Faye-Lund wrote: On Fri, Jun 26, 2015 at 1:23 PM, Davin McCall wrot

Re: [Mesa-dev] [PATCH v2 00/19] i965/fs: Remove the width field from fs_reg

2015-06-26 Thread Francisco Jerez
y we want now. > > 08: New. It's just moving code around so it should be trivial. > > 09: New. This is a complete replacement of patch 07 from the previous > series. > > Cc: Topi Pohjolainen > Cc: Iago Toral Quiroga > Cc: Francisco Jerez > Cc: Neil Ro

Re: [Mesa-dev] [PATCH v2] glsls: Modify exec_list to avoid strict-aliasing violations

2015-06-26 Thread Francisco Jerez
Erik Faye-Lund writes: > On Fri, Jun 26, 2015 at 4:53 PM, Francisco Jerez > wrote: >> Erik Faye-Lund writes: >> >>> On Fri, Jun 26, 2015 at 4:16 PM, Davin McCall wrote: >>>> On 26/06/15 14:53, Erik Faye-Lund wrote: >>>>> >

Re: [Mesa-dev] [PATCH v2] glsls: Modify exec_list to avoid strict-aliasing violations

2015-06-26 Thread Francisco Jerez
Erik Faye-Lund writes: > On Fri, Jun 26, 2015 at 4:01 PM, Francisco Jerez > wrote: >> Davin McCall writes: >> >>> On 26/06/15 14:31, Eirik Byrkjeflot Anonsen wrote: >>>> Erik Faye-Lund writes: >>>> >>>>> On Fri, Jun 26, 201

Re: [Mesa-dev] [PATCH v2 16/19] i965/fs: Use the builder dispatch_width for computing register offsets

2015-06-26 Thread Francisco Jerez
delta * MAX2(reg.width * reg.stride, 1) * > + delta * bld.dispatch_width() * reg.stride * Er... This doesn't look right for stride == 0. If you keep the MAX2(.., 1) expression this patch is: Reviewed-by: Francisco Jerez > type_sz(r
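The stride == 0 concern raised in this review can be illustrated with a minimal sketch. The helper names and the flat parameter list below are hypothetical, chosen only for illustration; they are not the actual i965 offset() implementation.

```cpp
// Hypothetical helpers, not Mesa code: byte offset of the delta-th
// component of a register, given the dispatch width, the register stride
// (in elements) and the element size in bytes.

constexpr unsigned max2(unsigned a, unsigned b) { return a > b ? a : b; }

// Without the clamp, a stride of 0 makes the whole product collapse to 0,
// so every component would map to the same offset.
constexpr unsigned offset_unclamped(unsigned delta, unsigned width,
                                    unsigned stride, unsigned type_sz)
{
   return delta * width * stride * type_sz;
}

// With the MAX2(.., 1) clamp, a stride-0 register still advances by one
// element per component.
constexpr unsigned offset_clamped(unsigned delta, unsigned width,
                                  unsigned stride, unsigned type_sz)
{
   return delta * max2(width * stride, 1u) * type_sz;
}

static_assert(offset_unclamped(1, 16, 0, 4) == 0, "stride 0 collapses");
static_assert(offset_clamped(1, 16, 0, 4) == 4, "clamp keeps one element");
```

For non-zero strides both variants agree, so under these assumptions the clamp only changes the stride == 0 case.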

Re: [Mesa-dev] [PATCH 2/2] clover: implement CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE

2015-06-26 Thread Francisco Jerez
g to a kernel's >> resource usage, but that's a possible optimization for the future. > > Ping? > > This is rather simple, but I'd like an Rb, if possible. That also goes > for the Gallium support patch. > For this patch: Reviewed-by: Francisco Jerez Tha

Re: [Mesa-dev] [PATCH 1/2] clover: fix event handling of buffer operations

2015-06-26 Thread Francisco Jerez
Grigori Goronzy writes: > On 2015-06-09 22:52, Francisco Jerez wrote: >>> + >>> + if (blocking) >>> + hev().wait(); >>> + >> >> hard_event::wait() may fail, so this should probably be done before the >> ret_object() call to a

Re: [Mesa-dev] [PATCH v2 16/19] i965/fs: Use the builder dispatch_width for computing register offsets

2015-06-26 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Jun 26, 2015 at 8:52 AM, Francisco Jerez > wrote: >> Jason Ekstrand writes: >> >>> Reviewed-by: Topi Pohjolainen >>> --- >>> src/mesa/drivers/dri/i965/brw_fs.h | 2 +- >>> 1 file changed, 1 insertion(+

Re: [Mesa-dev] [PATCH] nir: Make C++ more happy with NIR_SRC_INIT and NIR_DEST_INIT

2015-06-26 Thread Francisco Jerez
Jason Ekstrand writes: > In C, if you partially initialize a structure, the rest of the struct gets > set to 0. C++, however, does not have this rule so GCC throws warnings > whenever NIR_SRC_INIT or NIR_DEST_INIT is used in C++. I don't think that's right, in C++ initializers missing from an ag
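The point made in this reply (that C++ also initializes members left out of an aggregate initializer) can be checked with a small sketch. The struct `src_like` and the helper `make_partial` are hypothetical stand-ins for illustration, not NIR's actual nir_src or its NIR_SRC_INIT macro.

```cpp
// Hypothetical stand-in for a NIR-style source struct; not the real nir_src.
struct src_like {
   int index;
   bool is_ssa;
   void *parent;
};

// Partially initialize the aggregate: only `index` gets an explicit value.
// In C++ the members without an initializer are value-initialized, which
// for these types means 0, false and a null pointer respectively.
inline src_like make_partial(int index)
{
   src_like s = { index };
   return s;
}
```

The behaviour is well defined, although GCC may still emit a missing-field-initializers warning for this form when extra warnings are enabled, which is likely the warning the patch description refers to.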

Re: [Mesa-dev] [PATCH] nir: Make C++ more happy with NIR_SRC_INIT and NIR_DEST_INIT

2015-06-26 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Jun 26, 2015 at 12:08 PM, Francisco Jerez > wrote: >> Jason Ekstrand writes: >> >>> In C, if you partially initialize a structure, the rest of the struct gets >>> set to 0. C++, however, does not have this rule so GCC thro

Re: [Mesa-dev] [PATCH] nir: Make C++ more happy with NIR_SRC_INIT and NIR_DEST_INIT

2015-06-26 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Jun 26, 2015 at 3:03 PM, Francisco Jerez > wrote: >> Jason Ekstrand writes: >> >>> On Fri, Jun 26, 2015 at 12:08 PM, Francisco Jerez >>> wrote: >>>> Jason Ekstrand writes: >>>> >>>>>

Re: [Mesa-dev] [PATCH] nir: Make C++ more happy with NIR_SRC_INIT and NIR_DEST_INIT

2015-06-26 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Jun 26, 2015 at 3:34 PM, Francisco Jerez > wrote: >> Jason Ekstrand writes: >> >>> On Fri, Jun 26, 2015 at 3:03 PM, Francisco Jerez >>> wrote: >>>> Jason Ekstrand writes: >>>> >>>

Re: [Mesa-dev] [PATCH 1/2] gallium: add PIPE_COMPUTE_CAP_SUBGROUP_SIZE

2015-06-27 Thread Francisco Jerez
Grigori Goronzy writes: > We need this to implement OpenCL's > CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE. Reviewed-by: Francisco Jerez Thanks. > --- > src/gallium/docs/source/screen.rst | 2 ++ > src/gallium/drivers/ilo/ilo_screen.c | 8 +++

Re: [Mesa-dev] [PATCH 0/3] additions to loop unroll patchset

2015-06-29 Thread Francisco Jerez
ves/mesa-dev/2015-June/086049.html > > All patches applied on master: > http://cgit.freedesktop.org/~tpalli/mesa/log/?h=unroll_loops > Looks good to me, for the series: Reviewed-by: Francisco Jerez > Thanks; > > Tapani Pälli (3): > i965: use EmitNoIndirectSampler for ge

Re: [Mesa-dev] [PATCH v2] glsls: Modify exec_list to avoid strict-aliasing violations

2015-06-29 Thread Francisco Jerez
Davin McCall writes: > On 26/06/15 14:53, Francisco Jerez wrote: > >> [...] >> >> Your first approach seemed quite reasonable IMHO. Were you able to >> measure any performance regression from it? >> >> Thanks. >> > > Wh

Re: [Mesa-dev] [PATCH v2] glsls: Modify exec_list to avoid strict-aliasing violations

2015-06-29 Thread Francisco Jerez
Davin McCall writes: > On 29/06/15 10:40, Francisco Jerez wrote: >> Davin McCall writes: >> >>> On 26/06/15 14:53, Francisco Jerez wrote: >>> >>>> [...] >>>> >>>> Your first approach seemed quite reasonable IM

Re: [Mesa-dev] [PATCH v2 04/19] i965/fs: Report the right value in fs_inst::regs_read() for PIXEL_X/Y

2015-06-29 Thread Francisco Jerez
Jason Ekstrand writes: > Reviewed-by: Iago Toral Quiroga > Reviewed-by: Topi Pohjolainen > --- > src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp > b/src/mesa/drivers/dri/i965/brw_fs.cpp > index 589b74c..6

Re: [Mesa-dev] [PATCH v2 12/19] i965/fs: Use exec_size for determining regs read/written and partial writes

2015-06-30 Thread Francisco Jerez
Jason Ekstrand writes: > Reviewed-by: Topi Pohjolainen > --- > src/mesa/drivers/dri/i965/brw_fs.cpp | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp > b/src/mesa/drivers/dri/i965/brw_fs.cpp > index d1e253a..4f56865 100644 > -

Re: [Mesa-dev] [PATCH v2 09/19] i965/fs: Add a builder argument to offset()

2015-06-30 Thread Francisco Jerez
Jason Ekstrand writes: > --- > src/mesa/drivers/dri/i965/brw_fs.cpp | 42 > src/mesa/drivers/dri/i965/brw_fs.h | 2 +- > src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 2 +- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 58 +-- > src/mesa/drivers/dri/i

Re: [Mesa-dev] [PATCH v2 05/19] i965/fs: Explicitly set the exec_size on the add(32) in interpolation setup

2015-06-30 Thread Francisco Jerez
this->pixel_y = vgrf(glsl_type::float_type); > -- > 2.4.3 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev From 09f6cb08cd9951d8618dea7360aa7619cc80698

[Mesa-dev] [PATCH] i965/gen9: Use custom MOCS entries set up by the kernel.

2015-06-30 Thread Francisco Jerez
Instead of relying on hardware defaults the i915 kernel driver is going to program custom MOCS tables system-wide on Gen9 hardware. The "WT" entry previously used for renderbuffers had a number of problems: It disabled caching on eLLC, it used a reserved L3 cacheability setting, and it used to overri

Re: [Mesa-dev] [PATCH] i965/gen9: Use custom MOCS entries set up by the kernel.

2015-06-30 Thread Francisco Jerez
Ben Widawsky writes: > On Tue, Jun 30, 2015 at 11:25:42PM +0300, Francisco Jerez wrote: >> Instead of relying on hardware defaults the i915 kernel driver is >> going to program custom MOCS tables system-wide on Gen9 hardware. The >> "WT" entry previously used fo

Re: [Mesa-dev] [PATCH] i965/gen9: Use custom MOCS entries set up by the kernel.

2015-06-30 Thread Francisco Jerez
Ben Widawsky writes: > On Wed, Jul 01, 2015 at 12:33:54AM +0300, Francisco Jerez wrote: >> Ben Widawsky writes: >> >> > On Tue, Jun 30, 2015 at 11:25:42PM +0300, Francisco Jerez wrote: >> >> Instead of relying on hardware defaults the i915 kernel drive

Re: [Mesa-dev] [PATCH 2/2] i965/fs: Use the builder directly for the gen6 interpolation add(32)

2015-07-01 Thread Francisco Jerez
abld.exec_all().group(dispatch_width * 2, 0); The abld32 name seems misleading because this can actually be a 16 or 32 wide builder depending on dispatch_width. I suggest "dbld" (d for double), or just expand the definition in its only user and get rid of the temporary. With that fixed: Re

[Mesa-dev] [PATCHv0.5] i965/gen9: Use custom MOCS entries set up by the kernel.

2015-07-01 Thread Francisco Jerez
Instead of relying on hardware defaults the i915 kernel driver is going to program custom MOCS tables system-wide on Gen9 hardware. The "WT" entry previously used for renderbuffers had a number of problems: It disabled caching on eLLC, it used a reserved L3 cacheability setting, and it used to overri

[Mesa-dev] [PATCH] i965/gen9: Use custom MOCS entries set up by the kernel on BXT.

2015-07-01 Thread Francisco Jerez
Follow-up to "i965/gen9: Use custom MOCS entries set up by the kernel.", sent as a separate patch to make the SKL change easier to back-port to stable branches. --- This change depends on Ville's "[PATCH 1/2] i965: House MOCS settings in brw_context/brw_device_info": http://lists.freedesktop.org/a

Re: [Mesa-dev] [PATCH v2 16/19] i965/fs: Use the builder dispatch_width for computing register offsets

2015-07-01 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Jun 26, 2015 at 11:51 AM, Francisco Jerez > wrote: >> Jason Ekstrand writes: >> >>> On Fri, Jun 26, 2015 at 8:52 AM, Francisco Jerez >>> wrote: >>>> Jason Ekstrand writes: >>>> >>>>>

Re: [Mesa-dev] [PATCH v2 23/82] glsl: Do not do CSE for expressions involving SSBO loads

2015-07-03 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > On 29/06/15 09:11, Jordan Justen wrote: >> On 2015-06-24 07:36:24, Iago Toral wrote: >>> On Wed, 2015-06-24 at 15:43 +0300, Francisco Jerez wrote: >>>> AFAICT the reason why this (and many of the other changes in GLSL >>

Re: [Mesa-dev] [PATCH] i965/fs: Don't disable SIMD16 when using the pixel interpolator

2015-07-03 Thread Francisco Jerez
Neil Roberts writes: > There was a comment saying that in SIMD16 mode the pixel interpolator > returns coords interleaved 8 channels at a time and that this requires > extra work to support. However, this interleaved format is exactly > what the PLN instruction requires so I don't think anything

Re: [Mesa-dev] [PATCH 1/3] clover: separate compile and link stages

2015-07-05 Thread Francisco Jerez
Hi EdB, a bunch of comments inline, EdB writes: > --- > src/gallium/state_trackers/clover/api/program.cpp | 6 +- > .../state_trackers/clover/core/compiler.hpp| 7 +- > src/gallium/state_trackers/clover/core/error.hpp | 21 ++ > src/gallium/state_trackers/clover/core/program.cpp

Re: [Mesa-dev] [PATCH 2/3] clover: override ret_object

2015-07-05 Thread Francisco Jerez
How about "[...] from an intrusive reference to a Clover object [...]"? With that fixed: Reviewed-by: Francisco Jerez > + template > + typename T::descriptor_type * > + ret_object(const intrusive_ref &v) { > + v().retain(); > + return desc

Re: [Mesa-dev] [PATCH 1/3] clover: separate compile and link stages

2015-07-05 Thread Francisco Jerez
EdB writes: > On Sunday 05 July 2015 18:15:33 Francisco Jerez wrote: >>[...] >> > --- a/src/gallium/state_trackers/clover/core/error.hpp >> > +++ b/src/gallium/state_trackers/clover/core/error.hpp >> > @@ -68,10 +68,31 @@ namespace clover { >> >

Re: [Mesa-dev] [PATCH] i965/fs: Don't disable SIMD16 when using the pixel interpolator

2015-07-05 Thread Francisco Jerez
Hi Matt, Matt Turner writes: > On Fri, Jul 3, 2015 at 3:46 AM, Francisco Jerez wrote: >> Heh, I happened to come across this comment yesterday while looking for >> the remaining no16 calls and wondered why on earth it couldn't do the >> same that the normal interpolat

Re: [Mesa-dev] [PATCH] clover: Implement image attribute getters

2015-07-06 Thread Francisco Jerez
ribed in my reply to v1, it would be acceptable to implement it for the time being using a workaround similar to llvm/invocation.cpp:433 -- Hint: you'll need new module::argument::semantic enums. Thanks. > On Wed, Jun 24, 2015 at 2:48 PM, Francisco Jerez > wrote: >> Zoltan Gilian

[Mesa-dev] [PATCH 1/3] i965/gen4-5: Set ENDIF dst and src0 fields to the null register.

2015-07-06 Thread Francisco Jerez
The hardware docs don't mention explicitly what these fields should be, but I've verified experimentally on ILK that using a GRF as destination causes the register to be corrupted when the execution size of an ENDIF instruction is higher than 8 -- and because the destination we were using was g0, e

[Mesa-dev] [PATCH 2/3] i965/gen4-5: Program the execution size correctly for DO/WHILE instructions.

2015-07-06 Thread Francisco Jerez
From the hardware docs for the DO instruction: "Execution size is ignored for this instruction." My observation on ILK hardware contradicts the spec though: channels over the execution size of a DO instruction won't enter the loop, and channels over the execution size of a WHILE instruction will

[Mesa-dev] [PATCH 3/3] i965/gen4-5: Enable 16-wide dispatch on shaders with control flow.

2015-07-06 Thread Francisco Jerez
This was probably disabled due to a combination of several bugs in the generator code (fixed earlier in this series) and a misunderstanding of the hardware spec. The documentation for most control flow instructions mentions among other restrictions: "Instruction compression is not allowed." Thi

Re: [Mesa-dev] [PATCH] i965/fs: Don't disable SIMD16 when using the pixel interpolator

2015-07-07 Thread Francisco Jerez
Matt Turner writes: > On Sun, Jul 5, 2015 at 4:45 PM, Francisco Jerez wrote: >> Hi Matt, >> >> Matt Turner writes: >> >>> On Fri, Jul 3, 2015 at 3:46 AM, Francisco Jerez >>> wrote: >>>> Heh, I happened to come across this comment yest

Re: [Mesa-dev] [PATCH 0/8] Render node only opencl and pipe-loader cleanups

2015-07-07 Thread Francisco Jerez
ipe_loader_sw_probe_xlib) to using loader_open_device() over >> open(), with the former caring about CLOEXEC. >> > Francisco, Tom, > > Can you guys please take a look at the series. Even an Ack would be > greatly appreciated. > Looks OK to me, assuming that Tom is OK with th

[Mesa-dev] [PATCHv2] i965/gen9: Use custom MOCS entries set up by the kernel.

2015-07-07 Thread Francisco Jerez
Instead of relying on hardware defaults the i915 kernel driver is going to program custom MOCS tables system-wide on Gen9 hardware. The "WT" entry previously used for renderbuffers had a number of problems: It disabled caching on eLLC, it used a reserved L3 cacheability setting, and it used to overri

Re: [Mesa-dev] [Mesa-stable] [PATCHv2] i965/gen9: Use custom MOCS entries set up by the kernel.

2015-07-09 Thread Francisco Jerez
Ben Widawsky writes: > On Tue, Jul 07, 2015 at 10:21:28PM +0300, Francisco Jerez wrote: >> Instead of relying on hardware defaults the i915 kernel driver is >> going to program custom MOCS tables system-wide on Gen9 hardware. The >> "WT" entry previously used fo

[Mesa-dev] [HACK] i965/fs: Fix ordering of src0 alpha and oMask in the framebuffer write payload.

2015-07-09 Thread Francisco Jerez
We were passing src0 alpha and oMask in reverse order. There seems to be no good way to pass them in the correct order to the new-style LOAD_PAYLOAD (how surprising) because src0 alpha is per-channel while oMask is not. Just split src0 alpha in fixed-width registers and pass them to LOAD_PAYLOAD

[Mesa-dev] [HACK] i965/fs: Fix rescale_texcoord() for SIMD16 and remove no16 fall-back.

2015-07-09 Thread Francisco Jerez
Aside from the trivial GRF underallocation problem in the "devinfo->gen < 6 && is_rect" if-block, the texrect scale uniform look-up code was assuming a one-to-one mapping between UNIFORM register indices and the param array, which only holds during the SIMD8 run. It seems dubious that this needs t

[Mesa-dev] [PATCH] i965/fs: Reimplement nir_op_uadd_carry and _usub_borrow without accumulator.

2015-07-09 Thread Francisco Jerez
This gets rid of two no16() fall-backs and should allow better scheduling of the generated IR. There are no uses of usubBorrow() or uaddCarry() in shader-db so no changes are expected. However the "arb_gpu_shader5/execution/built-in-functions/fs-usubBorrow" and "arb_gpu_shader5/execution/built-in
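The excerpt does not show the instruction sequence the patch actually emits. As a rough illustration only, one common way to obtain the carry and borrow bits without a dedicated accumulator is a plain unsigned comparison; the function names below mirror the NIR opcodes but are hypothetical scalar helpers, not driver code.

```cpp
#include <cstdint>

// Assumed formulation, not taken from the patch: the 32-bit sum wraps
// around exactly when a carry occurs, so the carry is 1 iff (a + b) < a.
constexpr uint32_t uadd_carry(uint32_t a, uint32_t b)
{
   return static_cast<uint32_t>(a + b) < a ? 1u : 0u;
}

// A borrow is needed exactly when the subtrahend exceeds the minuend.
constexpr uint32_t usub_borrow(uint32_t a, uint32_t b)
{
   return a < b ? 1u : 0u;
}

static_assert(uadd_carry(0xffffffffu, 1u) == 1u, "wrap-around carries");
static_assert(usub_borrow(1u, 2u) == 1u, "1 - 2 borrows");
```

Both helpers depend only on ordinary compare operations, which is the kind of property that lets a compiler schedule them freely.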

Re: [Mesa-dev] [PATCH] i965/fs: Reimplement nir_op_uadd_carry and _usub_borrow without accumulator.

2015-07-09 Thread Francisco Jerez
pproach tomorrow. > On Thu, Jul 9, 2015 at 3:51 PM, Francisco Jerez wrote: >> This gets rid of two no16() fall-backs and should allow better >> scheduling of the generated IR. There are no uses of usubBorrow() or >> uaddCarry() in shader-db so no changes are expected. Ho

Re: [Mesa-dev] [HACK] i965/fs: Fix ordering of src0 alpha and oMask in the framebuffer write payload.

2015-07-10 Thread Francisco Jerez
Jason Ekstrand writes: > On Jul 9, 2015 7:57 AM, "Francisco Jerez" wrote: >> >> We were passing src0 alpha and oMask in reverse order. There seems to >> be no good way to pass them in the correct order to the new-style >> LOAD_PAYLOAD (how surprising) be

Re: [Mesa-dev] [HACK] i965/fs: Fix ordering of src0 alpha and oMask in the framebuffer write payload.

2015-07-10 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Jul 10, 2015 at 5:25 AM, Francisco Jerez > wrote: >> Jason Ekstrand writes: >> >>> On Jul 9, 2015 7:57 AM, "Francisco Jerez" wrote: >>>> >>>> We were passing src0 alpha and oMask in reverse o

[Mesa-dev] [PATCH 1/2] i965: Implement b2f and b2i using negation.

2015-07-10 Thread Francisco Jerez
Booleans are represented as 0/-1 on modern hardware which means we can just negate them to convert them into a numeric type. Negation has the benefit that it can be implemented using a source modifier which can easily be propagated into some other instruction. shader-db results on HSW: total in
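The negation trick described in this commit message can be sketched on scalar values. The helpers below are hypothetical and assume the 0/-1 boolean encoding stated above; they model the arithmetic only, not the i965 source-modifier mechanism.

```cpp
#include <cstdint>

// Assumption: a boolean is stored as 0 (false) or -1 (true, all bits set).

// b2i: negating the boolean yields 0 or 1.
constexpr int32_t b2i(int32_t b) { return -b; }

// b2f: negate as an integer first, then convert the 0 or 1 to float.
constexpr float b2f(int32_t b) { return static_cast<float>(-b); }

static_assert(b2i(-1) == 1, "true converts to 1");
static_assert(b2i(0) == 0, "false converts to 0");
static_assert(b2f(-1) == 1.0f, "true converts to 1.0");
```

Because the conversion is a single negation, it can in principle be folded into whichever instruction consumes the value, which is the benefit the message points to.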

[Mesa-dev] [PATCHv2 2/2] i965: Implement nir_op_uadd_carry and _usub_borrow without accumulator.

2015-07-10 Thread Francisco Jerez
This gets rid of two no16() fall-backs and should allow better scheduling of the generated IR. There are no uses of usubBorrow() or uaddCarry() in shader-db so no changes are expected. However the "arb_gpu_shader5/execution/built-in-functions/fs-usubBorrow" and "arb_gpu_shader5/execution/built-in

Re: [Mesa-dev] [PATCH] clover: Pass image attributes to the kernel

2015-07-10 Thread Francisco Jerez
Zoltan Gilian writes: > Read-only and write-only image arguments are recognized and > distinguished. > Attributes of the image arguments are passed to the kernel as implicit > arguments. Thanks, this looks much better. One thing that still seems kind of unfortunate is the fact that you've added

Re: [Mesa-dev] [PATCH 1/2] i965: Implement b2f and b2i using negation.

2015-07-10 Thread Francisco Jerez
Matt Turner writes: > On Fri, Jul 10, 2015 at 10:06 AM, Francisco Jerez > wrote: >> Booleans are represented as 0/-1 on modern hardware which means we can >> just negate them to convert them into a numeric type. Negation has >> the benefit that it can be implemented

Re: [Mesa-dev] [PATCH 2/2] clover: Use threadsafe wrappers for pipe_context v2

2015-07-11 Thread Francisco Jerez
eads. > > v2: > - Don't use wrapper for pipe_screen. > > CC: 10.6 Thanks, this patch is: Reviewed-by: Francisco Jerez > --- > src/gallium/state_trackers/clover/core/queue.cpp | 2 ++ > src/gallium/targets/opencl/Makefile.am | 4 +++- > 2 files changed, 5 i

Re: [Mesa-dev] [PATCH] clover: Fix bug with computing hard_event status

2015-07-11 Thread Francisco Jerez
Tom Stellard writes: > pipe_context::flush() can return a NULL fence if the queue is already > empty, so we should not assume that an event with a NULL fence > has the status of CL_QUEUED. > This seems suspicious... On the one hand it doesn't seem to be a documented "feature" of pipe_context::f

Re: [Mesa-dev] [PATCH] i965/fs: Make the texturing helpers take NIR opcodes instead of old IR ones

2015-07-13 Thread Francisco Jerez
Jason Ekstrand writes: > Now that the old GLSL IR visitor code is gone, having the remap is silly. > --- > src/mesa/drivers/dri/i965/brw_fs.h | 12 +-- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 18 +--- > src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 135 > ++

Re: [Mesa-dev] [PATCH 2/5] i965/fs: fix stride and type for hw_reg's in regs_read()

2015-07-14 Thread Francisco Jerez
> } > > bool > -- > 2.4.3 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev From f09181eadd3ff1cd10f1afeee13e6c4bb86caa91 Mon Sep 17 00:00:00 2001 Fr

Re: [Mesa-dev] [PATCH] clover: Pass image attributes to the kernel

2015-07-14 Thread Francisco Jerez
; input buffer) objectionable? Do you have any suggestions on how to > overcome this problem, so the metadata could be passed interleaved? > > On Fri, Jul 10, 2015 at 8:08 PM, Francisco Jerez > wrote: >> Zoltan Gilian writes: >> >>> Read-only and write-only image a

Re: [Mesa-dev] [PATCH 2/5] i965/fs: fix stride and type for hw_reg's in regs_read()

2015-07-15 Thread Francisco Jerez
Connor Abbott writes: > On Tue, Jul 14, 2015 at 6:02 AM, Francisco Jerez > wrote: >> Connor Abbott writes: >> >>> sources with file == HW_REG get all their information from the >>> fixed_hw_reg field, so we need to get the stride and type from there >>

[Mesa-dev] [PATCH 1/3] i965/fs: Fix stride for immediate registers.

2015-07-16 Thread Francisco Jerez
When the width field was removed from fs_reg the BROADCAST handling code in opt_algebraic() started to miss a number of trivial optimization cases resulting in the ugly indirect-addressing sequence being emitted unnecessarily for some variable-indexed texturing and UBO loads regardless of one of th

[Mesa-dev] [PATCH 3/3] i965: Fix stride field for the result of emit_uniformize().

2015-07-16 Thread Francisco Jerez
This is essentially the same problem fixed in an earlier patch for immediates. Setting the stride to zero will be particularly useful for my future SIMD lowering pass, because we will be able to just check whether the stride of a source register is zero and skip emitting the copies required to unz

[Mesa-dev] [PATCH 2/3] i965/fs: Fix stride field for uniforms.

2015-07-16 Thread Francisco Jerez
This fixes essentially the same problem as for immediates. Registers of the UNIFORM file are typically accessed according to the formula: read_uniform(r, channel_index, array_index) = read_element(r, channel_index * 0 + array_index * 1) Which matches the general direct addressing formula fo
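The read_uniform formula quoted in this message can be written out as a small sketch. The container type and both helper functions are assumptions made for illustration; only the index expression is taken from the text.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a register file as a flat array of elements.
inline float read_element(const std::vector<float> &r, std::size_t i)
{
   return r.at(i);
}

// The channel index is multiplied by 0, so every channel reads the same
// element; only the array index selects a different one.
inline float read_uniform(const std::vector<float> &r,
                          std::size_t channel_index,
                          std::size_t array_index)
{
   return read_element(r, channel_index * 0 + array_index * 1);
}
```

In this model, reading array element 1 from channel 0 and from channel 7 returns the same value, which is the sense in which the channel stride of a uniform is zero.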

[Mesa-dev] [PATCH 1/4] i965/fs: Add stub lowering pass for logical send-message opcodes.

2015-07-16 Thread Francisco Jerez
This pass will house ad-hoc lowering code for several send message-like virtual opcodes that will represent their logically independent arguments as separate instruction sources rather than as a single payload blob. This pass will basically just take the separate arguments that are supposed to be

[Mesa-dev] [PATCH 2/4] i965/fs: Add builder emit method taking a variable number of source registers.

2015-07-16 Thread Francisco Jerez
And start using it in fs_builder::LOAD_PAYLOAD(). This will be used to emit logical send message opcodes which have an unusually large number of arguments. --- src/mesa/drivers/dri/i965/brw_fs_builder.h | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/src/mesa/

[Mesa-dev] [PATCH 4/4] i965/fs: Implement pass to lower instructions of unsupported SIMD width.

2015-07-16 Thread Francisco Jerez
This lowering pass implements an algorithm to expand SIMDN instructions into a sequence of SIMDM instructions in cases where the hardware doesn't support the original execution size natively for some particular instruction. The most important use-cases are: - Lowering send message instructions t

[Mesa-dev] [PATCH 3/4] i965/fs: Fix return value of fs_inst::regs_read() for BAD_FILE.

2015-07-16 Thread Francisco Jerez
Typically BAD_FILE sources are used to mark a source as not present which implies that no registers are read. This will become much more frequent with logical send opcodes which have a large number of sources, many of them optionally used and marked as BAD_FILE when they aren't applicable. It will

[Mesa-dev] [PATCH 04/12] i965/fs: Fix slight layering violation in emit_single_fb_writes().

2015-07-16 Thread Francisco Jerez
In cases where the color0 argument wasn't being provided, emit_single_fb_writes() would take the alpha channel directly from the visitor state instead of taking it from its arguments. This sort of hack didn't fit nicely into the logical send-message approach because all parameters of the instructi

[Mesa-dev] [PATCH 06/12] i965/fs: Move up prog_data->uses_omask assignment up to brw_codegen_wm_prog().

2015-07-16 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 2 -- src/mesa/drivers/dri/i965/brw_wm.c | 3 ++- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 08d9abf..c489010 100644 -

[Mesa-dev] [PATCH 02/12] i965/fs: Honour the instruction force_sechalf and exec_size fields for FB writes.

2015-07-16 Thread Francisco Jerez
We were previously guessing the half based on the EOT flag which seems rather gross. --- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generat

[Mesa-dev] [PATCH 01/12] i965/fs: Define logical framebuffer write opcode.

2015-07-16 Thread Francisco Jerez
The logical variant is largely equivalent to the original opcode but instead of taking a single payload source it expects the arguments that make up the payload separately as individual sources, like: fb_write_logical null, color0, color1, src0_alpha, src_depth, dst_depth,

[Mesa-dev] [PATCH 03/12] i965/fs: Make sure that the type sizes are compatible during copy propagation.

2015-07-16 Thread Francisco Jerez
It's surprising that we weren't checking for this already. A future patch will cause code like the following to be emitted: MOV(16) tmp<1>:uw, src MOV(8) dst<1>:ud, tmp<8,8,1>:ud The second MOV comes from the expansion of a LOAD_PAYLOAD header copy, so I don't have control over its types. Cop

[Mesa-dev] [PATCH 09/12] i965/fs: Remove the FS_OPCODE_SET_OMASK pseudo-opcode.

2015-07-16 Thread Francisco Jerez
This is now unused. --- src/mesa/drivers/dri/i965/brw_defines.h| 1 - src/mesa/drivers/dri/i965/brw_fs.h | 4 --- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 35 -- src/mesa/drivers/dri/i965/brw_shader.cpp | 2 -- 4 files changed, 42 deleti

[Mesa-dev] [PATCH 10/12] i965/fs: Hook up SIMD lowering to unroll FB writes of unsupported width.

2015-07-16 Thread Francisco Jerez
This shouldn't have any effect because we don't emit logical framebuffer writes yet. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 9 + 1 file changed, 9 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index ae050b7..70fdc5e 100644 ---

[Mesa-dev] [PATCH 08/12] i965/fs: Don't attempt to copy the useless half of oMask for SIMD8 FB writes.

2015-07-16 Thread Francisco Jerez
There's no need to initialize the wrong half of oMask in the payload when we're doing an 8-wide framebuffer write because it will be ignored by the hardware anyway. By doing it this way we can let the SIMD lowering pass split the sample_mask source as a regular per-channel source, otherwise we wou

[Mesa-dev] [PATCH 11/12] i965/fs: Implement lowering of logical framebuffer writes.

2015-07-16 Thread Francisco Jerez
This does essentially the same thing as fs_visitor::emit_single_fb_write(), with some slight differences: - We don't have to worry about exec_size and use_2nd_half anymore, 16-wide sources have already been lowered to 8-wide thanks to the previous commit and the manual argument unzipping is

[Mesa-dev] [PATCH 05/12] i965/fs: Simplify control flow in emit_single_fb_write().

2015-07-16 Thread Francisco Jerez
Flatten the if ladder to match the way that the ordering of these fields is specified in the hardware documentation a bit more closely. --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 28 1 file changed, 16 insertions(+), 12 deletions(-) diff --git a/src/mesa/drive

[Mesa-dev] [PATCH 12/12] i965/fs: Reimplement emit_single_fb_write() in terms of logical framebuffer writes.

2015-07-16 Thread Francisco Jerez
The only non-trivial thing it still has to do is figure out where to take the src/dst depth values from and predicate the instruction if discard is in use. The manual SIMD unrolling logic in the dual-source case goes away because this is now handled transparently by the SIMD lowering pass. --- sr

[Mesa-dev] [PATCH 07/12] i965/fs: Move up Gen6 no16 check to emit_fb_writes().

2015-07-16 Thread Francisco Jerez
And update the comment. --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 20 +++- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index c489010..b5a42b1 100644 --- a/src/me

Re: [Mesa-dev] [PATCH 1/3] i965/fs: Fix stride for immediate registers.

2015-07-17 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > On 16/07/15 17:33, Francisco Jerez wrote: >> When the width field was removed from fs_reg the BROADCAST handling >> code in opt_algebraic() started to miss a number of trivial >> optimization cases resulting in the ugly indirect-address

Re: [Mesa-dev] [PATCH 1/3] i965/fs: Fix stride for immediate registers.

2015-07-17 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > On Fri, 2015-07-17 at 16:33 +0300, Francisco Jerez wrote: >> Samuel Iglesias Gonsálvez writes: >> >> > On 16/07/15 17:33, Francisco Jerez wrote: >> >> When the width field was removed from fs_reg the BROADCAST handling &

Re: [Mesa-dev] [PATCH] radeonsi: don't return NULL fence if no fence is available

2015-07-18 Thread Francisco Jerez
Michel Dänzer writes: > On 17.07.2015 06:03, Marek Olšák wrote: >> From: Marek Olšák >> >> An alternative (and ugly) solution to the current clover issue. > > How about something like this instead? (Compile tested only) > I'm rather unfamiliar with the radeonsi pipe driver code so I should pro

[Mesa-dev] [PATCH 02/12] i965/fs: Use exec_size instead of dispatch_width to determine the message variant.

2015-07-18 Thread Francisco Jerez
dispatch_width is global for a single compilation and doesn't necessarily match the desired execution width if we had to lower the original full-width instruction due to hardware limitations. These were all inside a Gen4-specific branch so this patch shouldn't have any effect on more recent hardwa
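The distinction drawn above can be sketched roughly as follows, assuming a hypothetical `Inst` record and `SimdMode` enum that stand in for the backend's real types. The point is only that the variant is chosen from the instruction's own width, which may be smaller than the compilation-wide dispatch width after lowering:

```cpp
#include <cassert>

// Hypothetical message variants; real hardware encodings are not modeled.
enum class SimdMode { SIMD8, SIMD16 };

// Hypothetical instruction record carrying its own execution width.
struct Inst {
    unsigned exec_size;
};

// Select the message variant from the instruction's own execution width
// rather than from the dispatch width of the whole compilation.
inline SimdMode message_variant(const Inst &inst)
{
    assert(inst.exec_size == 8 || inst.exec_size == 16);
    return inst.exec_size == 16 ? SimdMode::SIMD16 : SimdMode::SIMD8;
}
```

With this shape, an 8-wide instruction produced by splitting a 16-wide one inside a SIMD16 program still gets the 8-wide variant.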

[Mesa-dev] [PATCH 03/12] i965/fs: Fix opt_zero_samples() for texturing ops not matching dispatch_width.

2015-07-18 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 6afb9fe..c31a0e1 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/b

[Mesa-dev] [PATCH 07/12] i965/fs: Implement lowering of logical texturing opcodes on Gen5-6.

2015-07-18 Thread Francisco Jerez
This should be largely equivalent to emit_texture_gen5() except for slight code style changes and the use of i965 opcodes instead of the ir_texture_opcode enum, see "i965/fs: Implement lowering of logical texturing opcodes on Gen7+." for the mapping between them. --- src/mesa/drivers/dri/i965/brw_fs.c

[Mesa-dev] [PATCH 09/12] i965/fs: Hook up SIMD lowering to handle texturing opcodes of unsupported width.

2015-07-18 Thread Francisco Jerez
This should match the set of cases in which we currently call fail() or no16() from the emit_texture_*() methods and the ones in which emit_texture_gen4() enables the SIMD16 workaround. Hint for reviewers: It's not a big deal if I happen to have missed some case here, it will just lead to an asser

[Mesa-dev] [PATCH 06/12] i965/fs: Lower SHADER_OPCODE_TXF_UMS/MCS_LOGICAL too on Gen7+.

2015-07-18 Thread Francisco Jerez
These weren't being handled by emit_texture_gen7() but we can easily lower them here for consistency with other texturing opcodes. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/

[Mesa-dev] [PATCH 01/12] i965/fs: Define logical texture sampling opcodes.

2015-07-18 Thread Francisco Jerez
Each logical variant is largely equivalent to the original opcode but instead of taking a single payload source it expects the arguments separately as individual sources, like: tex_logical dst, coordinates, shadow_c, lod, lod2, sample_index, mcs, sampler, offset,

[Mesa-dev] [PATCH 05/12] i965/fs: Implement lowering of logical texturing opcodes on Gen7+.

2015-07-18 Thread Francisco Jerez
This should be largely equivalent to emit_texture_gen7() except that we now get i965 sampling opcodes directly rather than ir_texture_opcode enum values. The mapping is as follows: - ir_tex -> SHADER_OPCODE_TEX - ir_txb -> FS_OPCODE_TXB - ir_txl -> SHADER_OPCODE_TXL - ir_txd -> SHADER_OPCODE_
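As an illustration, the part of the mapping that is fully visible in this excerpt could be written as a lookup like the one below. The enums here are hypothetical stand-ins that reuse the names from the list; entries cut off in the excerpt are deliberately left out and fall through to the "no mapping" case:

```cpp
#include <cassert>

// Hypothetical stand-ins for the two opcode sets, restricted to the
// entries that appear in full in the list above.
enum class IrTextureOpcode { ir_tex, ir_txb, ir_txl, other };
enum class Opcode { SHADER_OPCODE_TEX, FS_OPCODE_TXB, SHADER_OPCODE_TXL };

// Translate an ir_texture_opcode value to the corresponding i965 opcode.
// Returns false when the excerpt does not show a mapping for `op`.
inline bool to_i965_opcode(IrTextureOpcode op, Opcode &out)
{
    switch (op) {
    case IrTextureOpcode::ir_tex: out = Opcode::SHADER_OPCODE_TEX; return true;
    case IrTextureOpcode::ir_txb: out = Opcode::FS_OPCODE_TXB;     return true;
    case IrTextureOpcode::ir_txl: out = Opcode::SHADER_OPCODE_TXL; return true;
    default:                      return false;
    }
}

// Usage: ir_txb is the one entry above that maps to an FS_ opcode.
inline void check_example()
{
    Opcode op = Opcode::SHADER_OPCODE_TEX;
    assert(to_i965_opcode(IrTextureOpcode::ir_txb, op));
    assert(op == Opcode::FS_OPCODE_TXB);
}
```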

[Mesa-dev] [PATCH 04/12] i965/fs: Pass a BAD_FILE header source to LOAD_PAYLOAD in emit_texture_gen7().

2015-07-18 Thread Francisco Jerez
So that it's left uninitialized by LOAD_PAYLOAD, we only need to reserve space for it in the message since it will be initialized implicitly by the generator. --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/mesa/driver

[Mesa-dev] [PATCH 08/12] i965/fs: Implement lowering of logical texturing opcodes on Gen4.

2015-07-18 Thread Francisco Jerez
Unlike its Gen5 and Gen7 counterparts this patch isn't a plain refactor of the previous Gen4 texturing code, it's more of a rewrite largely based on emit_texture_gen4_simd16(). The reason is that on the one hand the original emit_texture_gen4() code didn't seem easily fixable to be SIMD width-inva

[Mesa-dev] [PATCH 10/12] i965/fs: Reimplement emit_texture() in terms of logical send messages.

2015-07-18 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 66 +--- 1 file changed, 49 insertions(+), 17 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 89fcc49..4011639 100644 --- a/src/mesa/drivers/dri/
