[Mesa-dev] [PATCH v2] i965: Allow intel_try_pbo_upload for 3D and array textures

2014-12-22 Thread Neil Roberts
I just realised I made regular cube map textures stop working via the blit path with this patch. Here is a v2 which just adds GL_TEXTURE_CUBE_MAP to the switch in intel_try_pbo_upload. I've tested that it still works with a hacky tweak to the piglit test case. --- >8 --- (use git a

Re: [Mesa-dev] [PATCH 0/3] i965: Use intel_try_pbo_upload for sub updates and 3D textures

2015-01-08 Thread Neil Roberts
er path for Jason if he's already working on that. Regards, - Neil Chris Forbes writes: > Are there some performance numbers to go with this? > > On Tue, Dec 23, 2014 at 12:08 PM, Neil Roberts wrote: >> Here are some patches to make the i965 driver use the blit pipeline

Re: [Mesa-dev] [PATCH 4/5] meta: Add a BlitFramebuffers-based implementation of TexSubImage

2015-01-09 Thread Neil Roberts
This patch looks really good. I have some comments below. Jason Ekstrand writes: > This meta path, designed for use with PBO's, creates a temporary texture > out of the PBO and uses BlitFramebuffers to do the actual texture upload. > --- > src/mesa/Makefile.sources | 1 + >

Re: [Mesa-dev] [PATCH v2 2/2] i965: Use the predicate enable bit for conditional rendering without stalling

2015-01-09 Thread Neil Roberts
Daniel Vetter writes: > Oh, I guess my earlier mail was too late. One issue still is picking > the numbers, since you seem to assume here that ver >= 2 means the > stuff actually works. But like Ken said the cmd parser in upstream > isn't really enabled yet. The patch only enables the predicate

Re: [Mesa-dev] [PATCH 5/5] i965: Use _mesa_meta_TexSubImage for PBO's and cases where the texture is busy

2015-01-09 Thread Neil Roberts
Jason Ekstrand writes: > This improves texture upload performance on the PBO upload test available > at http://www.songho.ca/opengl/gl_pbo.html by 80% for the non-PBO case (due > to avoiding a buffer stall) and 500% for the PBO case. Just for reference, if I run this branch against the little te

Re: [Mesa-dev] [PATCH v2 4/5] meta: Add a BlitFramebuffers-based implementation of TexSubImage

2015-01-13 Thread Neil Roberts
u change that it looks good to me. Reviewed-by: Neil Roberts Regards, - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 5/5] i965: Use _mesa_meta_TexSubImage for PBO's and cases where the texture is busy

2015-01-13 Thread Neil Roberts
This patch and the rest of the series (apart from the comment for patch 4) look good to me and are Reviewed-by: Neil Roberts - Neil

[Mesa-dev] [PATCH] format_utils: Use a more precise conversion when decreasing bits

2015-01-14 Thread Neil Roberts
When converting to a format that has fewer bits the previous code was just shifting off the bits. This doesn't provide very accurate results. For example when converting from 8 bits to 5 bits it is equivalent to doing this: x * 32 / 256 This works as if it's taking a value from a range where 256
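To make the precision argument concrete, here is a minimal sketch that compares the truncating shift described above with a rounded rescale. The function names are hypothetical and the rounded formula is a standard unorm rescale, not necessarily the exact expression used in the patch. The 64-bit intermediate is an assumption chosen so that 32-bit inputs do not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Truncating conversion: simply drop the low bits (the old behaviour
 * described in the commit message). */
static uint32_t
unorm_shrink_shift(uint32_t x, unsigned src_bits, unsigned dst_bits)
{
   assert(dst_bits >= 1 && dst_bits <= src_bits && src_bits <= 32);
   return x >> (src_bits - dst_bits);
}

/* Rounded rescale from [0, src_max] to [0, dst_max].
 * Hypothetical helper; the intermediate is 64-bit so that the product
 * cannot overflow even when src_bits is 32. */
static uint32_t
unorm_shrink_round(uint32_t x, unsigned src_bits, unsigned dst_bits)
{
   assert(dst_bits >= 1 && dst_bits <= src_bits && src_bits <= 32);
   uint64_t src_max = (UINT64_C(1) << src_bits) - 1;
   uint64_t dst_max = (UINT64_C(1) << dst_bits) - 1;
   return (uint32_t)(((uint64_t)x * dst_max + src_max / 2) / src_max);
}
```

With 8-bit to 5-bit conversion, an input of 7 truncates to 0 under the shift but rounds to 1 under the rescale, since 7/255 of 31 is about 0.85.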

Re: [Mesa-dev] [PATCH] format_utils: Use a more precise conversion when decreasing bits

2015-01-14 Thread Neil Roberts
Neil Roberts writes: > + assert(src_bits + dst_bits <= sizeof(x) * 8); Erm, actually I didn't realise there were places calling this with dst_bits set to 32, so this isn't going to work. I probably should have waited for Piglit to finish before sending the patch

[Mesa-dev] [PATCH v2] format_utils: Use a more precise conversion when decreasing bits

2015-01-14 Thread Neil Roberts
When converting to a format that has fewer bits the previous code was just shifting off the bits. This doesn't provide very accurate results. For example when converting from 8 bits to 5 bits it is equivalent to doing this: x * 32 / 256 This works as if it's taking a value from a range where 256

Re: [Mesa-dev] [PATCH v2] format_utils: Use a more precise conversion when decreasing bits

2015-01-15 Thread Neil Roberts
Jason Ekstrand writes: > This looks fine to me. We should probably also do this for snorm formats. > I don't care if that's part of this or in a separate patch. > --Jason The snorm formats are a bit more fiddly because the hardware doesn't quite seem to be doing what I'd expect. For example, wh

Re: [Mesa-dev] [PATCH v2] format_utils: Use a more precise conversion when decreasing bits

2015-01-16 Thread Neil Roberts
) = (x >> n) + (x >> 2*n) + ... > > See also > > http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/auxiliary/gallivm/lp_bld_arit.c#n851 > > Jose > > (*) it can be expanded as shifts too, but it wouldn't be worthwhile

Re: [Mesa-dev] [mesa-dev][PATCH] Remove UINT_AS_FLT, INT_AS_FLT, FLOAT_AS_FLT macros.No functional changes, only bug fixed.

2015-01-21 Thread Neil Roberts
Marius, the ‘Reviewed-by’ tag should only be added if someone explicitly replies to your patch and says that you can add it with their name. It's supposed to mean that the person is happy for the patch to be pushed to master. I did not do this, I only looked at a previous version of the patch brief

Re: [Mesa-dev] [PATCH v1] Remove UINT_AS_FLT, INT_AS_FLT, FLOAT_AS_FLT macros.No functional changes, only bug fixed.

2015-01-22 Thread Neil Roberts
Hi, The COPY_CLEAN_4V_TYPE_AS_FLOAT still doesn't look right because as the last step it calls COPY_SZ_4V which will copy its float arguments using floating-point registers. It seems the piglit test case is still failing and if I step through with GDB I can see that it is hitting this code and usi

Re: [Mesa-dev] [PATCH v3 01/10] mesa/dd: Add a function for creating a texture from a buffer object

2015-01-22 Thread Neil Roberts
Jason Ekstrand writes: > --- > src/mesa/main/dd.h | 15 +++ > 1 file changed, 15 insertions(+) > > diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h > index 2f40915..eb30847 100644 > --- a/src/mesa/main/dd.h > +++ b/src/mesa/main/dd.h > @@ -415,6 +415,21 @@ struct dd_function_tabl

Re: [Mesa-dev] [PATCH v3 08/10] i965/pixel_read: Use meta_pbo_GetTexSubImage for PBO ReadPixels

2015-01-22 Thread Neil Roberts
Jason Ekstrand writes: > Since the meta path can do strictly more than the blitter path, we just > remove the blitter path entirely. > --- > src/mesa/drivers/dri/i965/intel_pixel_read.c | 130 > ++- > 1 file changed, 6 insertions(+), 124 deletions(-) > > diff --git a/src

Re: [Mesa-dev] [PATCH v3 08/10] i965/pixel_read: Use meta_pbo_GetTexSubImage for PBO ReadPixels

2015-01-22 Thread Neil Roberts
Jason Ekstrand writes: > diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c > b/src/mesa/drivers/dri/i965/intel_pixel_read.c > index 688a919..a64a5f4 100644 > --- a/src/mesa/drivers/dri/i965/intel_pixel_read.c > +++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c > @@ -172,15 +58,11 @@ int

Re: [Mesa-dev] [PATCH v3 09/10] i965/tex_image: Use meta for instead of the blitter PBO TexImage and GetTexImage

2015-01-22 Thread Neil Roberts
Jason Ekstrand writes: > - } > + if (_mesa_meta_pbo_GetTexSubImage(ctx, 3, texImage, 0, 0, 0, > + texImage->Width, texImage->Height, > + texImage->Depth, format, type, > + pixels, &ctx-

Re: [Mesa-dev] [PATCH v3 00/10] i965: Use the render pipeline for PBO uploads and

2015-01-22 Thread Neil Roberts
This series looks really good to me. I can confirm it gives a 241% transfer rate increase in that little pboUnpack test on BayTrail. Assuming the minor comments I made are fixed and the v2 patch for the pthread_once thingy is used then the series is: Reviewed-by: Neil Roberts Regards, - Neil

Re: [Mesa-dev] [PATCH v2] meta: Compute correct buffer size with SkipRows/SkipPixels

2015-09-01 Thread Neil Roberts
any data, it would just have the height wrong in the sampler state. Reviewed-by: Neil Roberts Regards, - Neil

[Mesa-dev] [PATCH] mesa/pbo: Handle zero height or depth when validating PBO access

2015-09-01 Thread Neil Roberts
It's legal to call glTexSubImage with zero values for the width, height or depth. Previously this was breaking the PBO access validation because it tries to work out the last pixel accessed by getting the pixel at height-1 and depth-1 which would end up with bogus values. This was causing GL error

Re: [Mesa-dev] [PATCH] mesa/pbo: Handle zero height or depth when validating PBO access

2015-09-02 Thread Neil Roberts
Ian Romanick writes: > It seems like it should be handled in the core, and it looks like > _mesa_tex_sub_image is already doing that. Note the "if (width > 0 && > height > 0 && depth > 0)" check. What is the callstack that gets here > with height or depth as zero? That seems fishy. This funct

Re: [Mesa-dev] [PATCH] mesa/pbo: Handle zero height or depth when validating PBO access

2015-09-02 Thread Neil Roberts
Ilia Mirkin writes: >> - end = _mesa_image_offset(dimensions, pack, width, height, >> - format, type, depth-1, height-1, width); >> + if (depth == 0 || height == 0) > > Why not width == 0 as well? You could probably just do > > return GL_TRUE; > > in that case a

[Mesa-dev] [PATCH v2] mesa/pbo: Handle zero width, height or depth when validating access

2015-09-02 Thread Neil Roberts
It's legal to call glTexSubImage with zero values for the width, height or depth. Previously this was breaking the PBO access validation because it tries to work out the last pixel accessed by getting the pixel at height-1 and depth-1 which would end up with bogus values. This was causing GL error
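The failure mode can be illustrated with a simplified bounds check. This is only a sketch: `pbo_access_in_bounds` is a hypothetical helper, it assumes tightly packed rows with no skip or alignment parameters, and it omits integer overflow checks for brevity. The early return for a zero dimension is the point being shown, since otherwise `height - 1` or `depth - 1` would wrap around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a PBO access validation:
 * returns true if every byte touched by the transfer lies inside
 * the buffer. Assumes tightly packed rows and images. */
static bool
pbo_access_in_bounds(size_t buffer_size, size_t offset,
                     size_t bytes_per_pixel,
                     size_t width, size_t height, size_t depth)
{
   assert(bytes_per_pixel > 0);

   /* A zero-sized transfer touches no memory, so it is always valid.
    * Without this check, height - 1 and depth - 1 below would wrap. */
   if (width == 0 || height == 0 || depth == 0)
      return true;

   size_t row_stride = width * bytes_per_pixel;
   size_t image_stride = row_stride * height;

   /* One past the last byte of the last pixel accessed. */
   size_t end = offset +
                (depth - 1) * image_stride +
                (height - 1) * row_stride +
                width * bytes_per_pixel;

   return end <= buffer_size;
}
```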

Re: [Mesa-dev] [PATCH v2] mesa/pbo: Handle zero width, height or depth when validating access

2015-09-03 Thread Neil Roberts
Jason Ekstrand writes: > We can probably just bail higher up in the stack and never call the > driver hook if we have a zero dimension. That would also protect us > from silly zero-dim bugs that may exist. Yes, that already does happen. As mentioned elsewhere in the thread, _mesa_validate_pbo_ac

Re: [Mesa-dev] [PATCH] meta: Always bind the texture

2015-09-10 Thread Neil Roberts
Makes sense. Is this the only use of the currentTexUnitSave variable? Could be good to remove it if so. Reviewed-by: Neil Roberts - Neil Ian Romanick writes: > From: Ian Romanick > > We may have been called from glGenerateTextureMipmap with CurrentUnit > still set to 0, so w

[Mesa-dev] [PATCH 00/12] i965: Add support for 16x MSAA on SKL+

2015-09-17 Thread Neil Roberts
I ran the series through Piglit but there are some issues. The texelFetch test appears to be broken for sample counts > 10 and needs this patch to work: http://patchwork.freedesktop.org/patch/59485/ The accuracy tests are failing but I think the problem is just that it is too strict. I've writte

[Mesa-dev] [PATCH 01/12] i965: Handle 16x MSAA in IMS dimension munging code.

2015-09-17 Thread Neil Roberts
From: Kenneth Graunke Signed-off-by: Kenneth Graunke Reviewed-by: Neil Roberts --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965

[Mesa-dev] [PATCH 02/12] i965: Program 16x MSAA sample positions.

2015-09-17 Thread Neil Roberts
This is the standard pattern used by the other 3D graphics API. BDW has slots for these values, but they aren't actually used until SKL. Even though the documentation for BDW says they must be zero, it doesn't seem to cause any harm to program them anyway. The comment above for the 8x sample posi

[Mesa-dev] [PATCH 03/12] i965/fs: Disable SIMD16 when a sampler message would be too long

2015-09-17 Thread Neil Roberts
The maximum message length for a send message is 11. Some of the sampler message types have more than 5 arguments which means when they are doubled to accommodate the SIMD16 register size then the message is too long. This is important for the ld2dms_w message which will be used in a later patch bec
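The arithmetic behind that limit can be sketched as follows. The helper names are hypothetical, and the model is deliberately simplified: it assumes one register per argument in SIMD8, two in SIMD16, and ignores any message header.

```c
#include <assert.h>
#include <stdbool.h>

/* Limit quoted in the commit message. */
#define MAX_SEND_MESSAGE_LENGTH 11

/* Hypothetical model: each argument takes one register in SIMD8 and
 * two in SIMD16. Any message header is ignored here. */
static unsigned
sampler_message_regs(unsigned num_args, unsigned simd_width)
{
   assert(simd_width == 8 || simd_width == 16);
   return num_args * (simd_width / 8);
}

static bool
sampler_message_fits(unsigned num_args, unsigned simd_width)
{
   return sampler_message_regs(num_args, simd_width) <=
          MAX_SEND_MESSAGE_LENGTH;
}
```

Under these assumptions five arguments fit in SIMD16 (10 registers) but six do not (12 registers), while six arguments still fit in SIMD8.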

[Mesa-dev] [PATCH 04/12] i965/fs/skl+: Use lcd2dms_w instead of lcd2dms

2015-09-17 Thread Neil Roberts
In order to support 16x MSAA, skl+ has a wider version of lcd2dms that takes two parameters for the MCS data. This patch makes it allocate a register that is twice as big for the MCS data and then always use the wider version. --- src/mesa/drivers/dri/i965/brw_defines.h| 4 src/mesa/

[Mesa-dev] [PATCH 11/12] meta: Support 16x MSAA in the multisample scaled blit shader

2015-09-17 Thread Neil Roberts
I'm not too sure about the expression used to index into sample_map in the shader. It looks like if fract(coord.x) and fract(coord.y) are close to 1.0 then it would index outside of the array. However the code for 4 and 8 has the same problem and the results seem to look reasonable. It might make

[Mesa-dev] [PATCH 10/12] i965/meta: Support 16x MSAA in the meta stencil blit

2015-09-17 Thread Neil Roberts
The destination rectangle is now drawn at 4x4 the size and the shader code to calculate the sample number is adjusted accordingly. --- src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c | 22 +- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/

[Mesa-dev] [PATCH 06/12] i965/fs: Add a sampler program key for whether the texture is 16x MSAA

2015-09-17 Thread Neil Roberts
When 16x MSAA is used for sampling with texelFetch the compiler needs to use a different instruction which passes more arguments for the MCS data. Previously on skl+ it was unconditionally using this new instruction. However since 16x MSAA is probably going to be pretty rare, it is probably worthwh

[Mesa-dev] [PATCH 09/12] i965/fs/skl+: Fix calculating gl_SampleID for 16x MSAA

2015-09-17 Thread Neil Roberts
In order to accommodate 16x MSAA, the starting sample pair index is now 3 bits rather than 2 on SKL+. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp ind

[Mesa-dev] [PATCH 07/12] i965: Support calculating the bits needed to set up 16x MSAA

2015-09-17 Thread Neil Roberts
The gen7_surface_msaa_bits function already returns the right values for 16 samples but it just needs its assert to be relaxed. --- src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.

[Mesa-dev] [PATCH 12/12] i965/skl+: Enable support for 16x multisampling

2015-09-17 Thread Neil Roberts
--- src/mesa/drivers/dri/i965/brw_context.c | 6 ++ src/mesa/drivers/dri/i965/intel_screen.c | 5 - 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 7c1c133..c05fb74 100644 --- a/src/mes

[Mesa-dev] [PATCH 05/12] i965/vec4/skl+: Use lcd2dms_w instead of lcd2dms

2015-09-17 Thread Neil Roberts
In order to support 16x MSAA, skl+ has a wider version of lcd2dms that takes two parameters for the MCS data. The MCS data in the response still fits in a single register so we just need to ensure we copy both values rather than just the lower one. --- src/mesa/drivers/dri/i965/brw_vec4.cpp

[Mesa-dev] [PATCH 08/12] i965: Support allocating the MCS buffer for 16x MSAA

2015-09-17 Thread Neil Roberts
When 16 samples are used the MCS buffer needs 64 bits per pixel. --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 0cb0632..9faafb4 100644

Re: [Mesa-dev] [PATCH 03/12] i965/fs: Disable SIMD16 when a sampler message would be too long

2015-09-18 Thread Neil Roberts
Francisco Jerez writes: > NAK, these cases are already handled without disabling SIMD16 by > lowering the SIMD16 message into SIMD8 halves. You just need to add a > case to get_lowered_simd_width() so that the SIMD lowering pass knows > what the maximum execution size is for your new sampler mess

Re: [Mesa-dev] [PATCH 05/12] i965/vec4/skl+: Use lcd2dms_w instead of lcd2dms

2015-09-23 Thread Neil Roberts
Ben Widawsky writes: >>} else if (op == ir_txf_ms) { >> emit(MOV(dst_reg(MRF, param_base + 1, sample_index.type, >> WRITEMASK_X), >>sample_index)); >> - if (devinfo->gen >= 7) { >> + if (opcode == SHADER_OPCODE_TXF_CMS_W) { >> +/*

Re: [Mesa-dev] [PATCH 06/12] i965/fs: Add a sampler program key for whether the texture is 16x MSAA

2015-09-23 Thread Neil Roberts
Ben Widawsky writes: > On Thu, Sep 17, 2015 at 05:00:08PM +0100, Neil Roberts wrote: >> When 16x MSAA is used for sampling with texelFetch the compiler needs >> to use a different instruction which passes more arguments for the MCS >> data. Previously on skl+ it was uncon

Re: [Mesa-dev] [PATCH 04/12] i965/fs/skl+: Use lcd2dms_w instead of lcd2dms

2015-09-23 Thread Neil Roberts
Ben Widawsky writes: >> + /* On Gen9+ we'll use lcd2ms_w instead which has two registers for >> + * the MCS data. >> + */ >> + if (op == SHADER_OPCODE_TXF_CMS_W) { >> +bld.MOV(retype(sources[length], BRW_REGISTER_TYPE_UD), >> +mcs.

Re: [Mesa-dev] [PATCH 06/12] i965/fs: Add a sampler program key for whether the texture is 16x MSAA

2015-09-23 Thread Neil Roberts
Ben Widawsky writes: > Hmm. As I read it, it sounded like you didn't have to send LOD it's > implied to be 0 if you don't send it. If I am wrong about that, then I > agree with you completely. I'm a bit lost. You're right that it's not necessary to send the LOD when it's zero. In fact Mesa never

[Mesa-dev] [PATCH v2] meta: Support 16x MSAA in the multisample scaled blit shader

2015-09-24 Thread Neil Roberts
v2: Fix the x_scale in the shader. Remove the doubts in the commit message. --- After some helpful explanation from Anuj and reading the code a bit more, I think I understand this a bit better and I no longer think there is an issue with the sample map array having out-of-bounds indices. The t

Re: [Mesa-dev] [PATCH 09/12] i965/fs/skl+: Fix calculating gl_SampleID for 16x MSAA

2015-09-28 Thread Neil Roberts
Anuj Phogat writes: > As per docs we're supposed to get the per slot SampleID written to > 15:0 bits in R1.0. I used SSPI to compute the SampleID because I never > got anything useful in these bits on IVB. Things might have changed on > later platforms. So, I think it's worth trying to do what do

Re: [Mesa-dev] [PATCH 00/12] i965: Add support for 16x MSAA on SKL+

2015-09-28 Thread Neil Roberts
Neil Roberts writes: > The following tests are failing but on my SKL device the corresponding > tests with 8 samples are also failing. As far as I understand these > aren't known regressions for other people so it may be something to do > with my device being pre-production. It

[Mesa-dev] [PATCH] mesa/meta: Use interpolateAtSample for 16x MSAA copy blit

2015-09-28 Thread Neil Roberts
Previously there was a problem in i965 where if 16x MSAA is used then some of the sample positions are exactly on the 0 x or y axis. When the MSAA copy blit shader interpolates the texture coordinates at these sample positions it was possible that it would jump to a neighboring texel due to roundin

Re: [Mesa-dev] [PATCH] mesa/meta: Use interpolateAtSample for 16x MSAA copy blit

2015-09-29 Thread Neil Roberts
Ilia Mirkin writes: > A couple of fairly generic comments: > > - It is not at all clear to me why it's OK to interpolate at sample 0 Yes, this was cheating a little bit. At least on Intel hardware the samples are supposed to be sorted by order of distance from the centre so sample 0 will be the

[Mesa-dev] [PATCH v2] mesa/meta: Use interpolateAtOffset for 16x MSAA copy blit

2015-09-29 Thread Neil Roberts
Previously there was a problem in i965 where if 16x MSAA is used then some of the sample positions are exactly on the 0 x or y axis. When the MSAA copy blit shader interpolates the texture coordinates at these sample positions it was possible that it would jump to a neighboring texel due to roundin

Re: [Mesa-dev] [PATCH] util: implement strndup for WIN32

2015-09-29 Thread Neil Roberts
I think this implementation will have problems if the string being copied is not null terminated. It's not clear from the man pages whether that is an allowed way to use the function but a quick Google shows up a few similar patches where they have later been fixed by using strnlen. It looks like s
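A minimal sketch of a strndup that never reads past the first n bytes is shown below. The name `sketch_strndup` is hypothetical, and `memchr` is used here to do the bounded length search that strnlen performs, only so that the example does not depend on strnlen being declared on every platform.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical strndup replacement. It inspects at most n bytes of s,
 * so s does not need to be NUL-terminated as long as n bytes are
 * readable. The caller frees the result. */
static char *
sketch_strndup(const char *s, size_t n)
{
   assert(s != NULL);

   /* Bounded length: equivalent to what strnlen(s, n) returns. */
   const char *nul = memchr(s, '\0', n);
   size_t len = nul ? (size_t)(nul - s) : n;

   char *copy = malloc(len + 1);
   if (copy == NULL)
      return NULL;

   memcpy(copy, s, len);
   copy[len] = '\0';
   return copy;
}
```

Calling plain strlen on the source first would walk past the end of an unterminated buffer, which is the problem described above.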

Re: [Mesa-dev] [PATCH] util: use strnlen() in strndup() implementations

2015-09-29 Thread Neil Roberts
Looks good to me. Thanks for doing that. Reviewed-by: Neil Roberts - Neil Samuel Iglesias Gonsalvez writes: > If the string being copied is not NULL-terminated the result of > strlen() is undefined. > > Signed-off-by: Samuel Iglesias Gonsalvez > --- > src/util/ralloc.c

Re: [Mesa-dev] [PATCH v2 2/2] i965/fs: Handle non-const sample number in interpolateAtSample

2015-10-02 Thread Neil Roberts
Francisco Jerez writes: > Sigh, it's really awful that our hardware only supports a single sample > index for the whole SIMD thread... I was thinking though that there > might be a better alternative to running the sample-index interpolator > query in a loop: The "Per Slot Offset" interpolator q

Re: [Mesa-dev] [PATCH v2 2/2] i965/fs: Handle non-const sample number in interpolateAtSample

2015-10-02 Thread Neil Roberts
Matt Turner writes: >> +static fs_reg >> +get_num_samples_reg(fs_visitor *v) >> +{ >> + struct gl_program_parameter_list *params = v->prog->Parameters; >> + static gl_state_index tokens[STATE_LENGTH] = { > > I suspect this isn't thread-safe. Do you mean because the tokens array is static? I

[Mesa-dev] [PATCH 1/2] i965: Add a second successor to BRW_OPCODE_WHILE

2015-10-05 Thread Neil Roberts
It is possible to directly predicate the WHILE instruction. In this case there will be a second successor block because the execution can resume from the instruction after the loop. This will be used in a subsequent patch. --- src/mesa/drivers/dri/i965/brw_cfg.cpp | 4 1 file changed, 4 inser

[Mesa-dev] [PATCH v3 2/2] i965/fs: Handle non-const sample number in interpolateAtSample

2015-10-05 Thread Neil Roberts
If a non-const sample number is given to interpolateAtSample it will now generate an indirect send message with the sample ID similar to how non-const sampler array indexing works. Previously non-const values were ignored and instead it ended up using a constant 0 value. The generator will try to

[Mesa-dev] [PATCH v4] i965/fs: Handle non-const sample number in interpolateAtSample

2015-10-07 Thread Neil Roberts
If a non-const sample number is given to interpolateAtSample it will now generate an indirect send message with the sample ID similar to how non-const sampler array indexing works. Previously non-const values were ignored and instead it ended up using a constant 0 value. The generator will try to

[Mesa-dev] [PATCH] nir: Mark the shader name during nir_sweep

2015-10-08 Thread Neil Roberts
Previously the name of the nir shader was being freed prematurely during nir_sweep. Since 756613ed35d the name was later being used to generate filenames for the optimiser debug output and these would end up with garbage from the dangling pointer. --- src/glsl/nir/nir_sweep.c | 3 +++ 1 file chang

Re: [Mesa-dev] [PATCH] nir/sweep: Reparent the shader name

2015-10-08 Thread Neil Roberts
Oops, I just made a similar patch without noticing this one. Feel free to take the commit message from my patch if you want. Either way this one is: Reviewed-by: Neil Roberts http://patchwork.freedesktop.org/patch/61369/ Sorry for the noise. Regards, - Neil Jason Ekstrand writes

Re: [Mesa-dev] [PATCH 0/2] i965/gen9: Enable rep clears

2015-10-09 Thread Neil Roberts
Seems like a good idea to me. Series is Reviewed-by: Neil Roberts - Neil Chad Versace writes: > This series lives at > git://github.com/chadversary/mesa refs/tags/skl-fast-clear-v08.01 > > No Piglit regressions on: > - Skylake 0x1912 (rev 06) > - linux 4.3-rc4

[Mesa-dev] [PATCH v5] i965/fs: Handle non-const sample number in interpolateAtSample

2015-10-09 Thread Neil Roberts
If a non-const sample number is given to interpolateAtSample it will now generate an indirect send message with the sample ID similar to how non-const sampler array indexing works. Previously non-const values were ignored and instead it ended up using a constant 0 value. The generator will try to

Re: [Mesa-dev] [PATCH] i965/meta-fast-clear: Convert the clear color through the surf format

2016-01-13 Thread Neil Roberts
Bump. Anyone fancy reviewing this small patch? I think it would be good to have because it makes the code a bit simpler as well as fixing a corner case and making it more robust. - Neil Neil Roberts writes: > When programming the fast clear color there was previously a chunk of > code

[Mesa-dev] [PATCH] texobj: Check completeness with InternalFormat rather than Mesa format

2016-01-13 Thread Neil Roberts
The internal Mesa format used for a texture might not match the one requested in the internalFormat when the texture was created, for example if the driver is internally remapping RGB textures to RGBA. Otherwise it can cause false positives for completeness if one mipmap image is created as RGBA an

[Mesa-dev] [PATCH 1/2] texobj: Fix the completeness checks for cube textures

2016-01-21 Thread Neil Roberts
According to the GL 1.4 spec section 3.8.10, a cubemap texture is only complete if: • The level base arrays of each of the six texture images making up the cube map have identical, positive, and square dimensions. • The level base arrays were each specified with the same internal format. • The

[Mesa-dev] [PATCH 2/2] texobj: Remove redundant checks that the texture cube faces match size

2016-01-21 Thread Neil Roberts
The texture mipmap completeness checking code was checking whether all of the faces have the same size. However this is pointless because the code just above it checks whether the face has the expected size calculated for the mipmap level anyway so the error condition could never be reached. This p

[Mesa-dev] [PATCH 1/4] main: Use _mesa_geometric_samples to calculate the value of GL_SAMPLES

2016-02-04 Thread Neil Roberts
Otherwise it won't take into account the default samples for framebuffers with no attachments. --- src/mesa/main/get.c | 4 src/mesa/main/get_hash_params.py | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 04348

[Mesa-dev] [PATCH 4/4] main: Use a derived value for the default sample count

2016-02-04 Thread Neil Roberts
Previously the framebuffer default sample count was taken directly from the value given by the application. On the i965 driver on HSW if the value wasn't one that is supported by the hardware it would hit an assert when it tried to program the state for it. This patch fixes it by adding a derived s
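One way to picture a derived sample count is a quantization step that maps whatever the application asked for onto a count the hardware supports. The sketch below is an assumption about the general idea, not the patch's actual code: `quantize_samples` and the supported list are hypothetical, and the policy shown (smallest supported count that is at least the request, otherwise the largest) is just one plausible choice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical quantization of a requested sample count.
 * `supported` must be sorted in ascending order and non-empty.
 * A request of 0 (no multisampling) is passed through unchanged. */
static unsigned
quantize_samples(unsigned requested, const unsigned *supported, size_t n)
{
   assert(supported != NULL && n > 0);

   if (requested == 0)
      return 0;

   for (size_t i = 0; i < n; i++) {
      if (supported[i] >= requested)
         return supported[i];
   }

   /* Nothing large enough: fall back to the largest supported count. */
   return supported[n - 1];
}
```

The state-programming code would then consume only the derived value, so it never sees a count such as 3 that the hardware cannot express.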

[Mesa-dev] [PATCH 3/4] program: Use _mesa_geometric_samples to calculate gl_NumSamples

2016-02-04 Thread Neil Roberts
Otherwise it won't take into account the default samples for framebuffers with no attachments. --- src/mesa/program/prog_statevars.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/program/prog_statevars.c b/src/mesa/program/prog_statevars.c index 12490d0..eed2412 1

[Mesa-dev] [PATCH 2/4] main: Use _mesa_geometric_samples to calculate GL_SAMPLE_BUFFERS

2016-02-04 Thread Neil Roberts
Otherwise it won't take into account the default samples for framebuffers with no attachments. --- src/mesa/main/get.c | 3 +++ src/mesa/main/get_hash_params.py | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 307059

[Mesa-dev] [PATCH] i965/skl: Don't try to apply the opt_sampler_eot extension for vs

2015-04-28 Thread Neil Roberts
The opt_sampler_eot optimisation of fs_visitor effectively assumes that it is running on a fragment shader because it casts the program key to a brw_wm_prog_key. However on Skylake fs_visitor can also be used for vertex shaders. It looks like this usually works anyway because the optimisation is sk

[Mesa-dev] [PATCH v2] i965/fs: Strip trailing constant zeroes in sample messages

2015-04-30 Thread Neil Roberts
If a send message is emitted with a message length that is less than required for the message then the remaining parameters default to zero. We can take advantage of this to save a register when a shader passes constant zeroes as the final coordinates to the sample function. I think this might be

Re: [Mesa-dev] [PATCH 5/6] i965/skl: Align compressed textures to four times the block size

2015-05-01 Thread Neil Roberts
Sorry for the really long delay in replying! This patch is still needed in order to fix a number of Piglit tests so it would be good to get it landed. Ben Widawsky writes: > Sorry for the delay, but I put this off initially because I wasn't > sure which part of the docs this was addressing. I se

Re: [Mesa-dev] [PATCH 08/13] util/list: Add C99-based iterator macros

2015-05-05 Thread Neil Roberts
Jason Ekstrand writes: > +#define list_for_each_entry(type, pos, head, member)\ > + for (type *pos = container_of((head)->next, pos, member);\ > + &pos->member != (head); \ > + pos = container_of(pos->member.next, p

Re: [Mesa-dev] [PATCH 09/13] util/list: Add list_empty and list_length functions

2015-05-05 Thread Neil Roberts
Jason Ekstrand writes: > +static inline bool list_empty(struct list_head *list) > +{ > + return list->next == list; > +} It would be good if list.h also included stdbool.h in order to get the declaration of bool. However, will that cause problems on MSVC? Is the Gallium code compiled on MSVC i

Re: [Mesa-dev] [PATCH 10/13] util/list: Add a list validation function

2015-05-05 Thread Neil Roberts
Jason Ekstrand writes: > +static inline void list_validate(struct list_head *list) > +{ > + assert(list->next->prev == list && list->prev->next == list); > + for (struct list_head *node = list->next; node != list; node = node->next) > + assert(node->next->prev == node && node->prev->next
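For readers unfamiliar with kernel-style lists, the helpers discussed in this part of the series can be sketched on a circular doubly linked list with a single sentinel head. The `sketch_` names are hypothetical and chosen to avoid implying they match the real util/list.h API. The empty check and the validation asserts follow the quoted snippets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head {
   struct list_head *prev;
   struct list_head *next;
};

/* An empty list is a sentinel that points at itself. */
static void
sketch_list_init(struct list_head *list)
{
   list->prev = list;
   list->next = list;
}

static void
sketch_list_add_tail(struct list_head *item, struct list_head *list)
{
   item->next = list;
   item->prev = list->prev;
   list->prev->next = item;
   list->prev = item;
}

static bool
sketch_list_empty(const struct list_head *list)
{
   return list->next == list;
}

static size_t
sketch_list_length(const struct list_head *list)
{
   size_t length = 0;
   for (const struct list_head *node = list->next; node != list;
        node = node->next)
      length++;
   return length;
}

/* Asserts that every node's neighbours point back at it. */
static void
sketch_list_validate(const struct list_head *list)
{
   assert(list->next->prev == list && list->prev->next == list);
   for (const struct list_head *node = list->next; node != list;
        node = node->next)
      assert(node->next->prev == node && node->prev->next == node);
}
```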

Re: [Mesa-dev] [PATCH 3/3 v5] i965/fs: Combine tex/fb_write operations (opt)

2015-05-06 Thread Neil Roberts
Hi, This optimisation doesn't seem to work with textureGather so a bunch of Piglit tests are failing for me. I'm not sure why it didn't get picked up by your Jenkins run. I can't find anything in the bspec nor a known workaround to suggest that this shouldn't work so I'm not really sure what to d

[Mesa-dev] [PATCH] i965/skl: In opt_sampler_eot always set destination register to null

2015-05-07 Thread Neil Roberts
opt_sampler_eot enables a direct write to framebuffer from a sample. In order to do this the sample message needs to have a message header so if there wasn't one already then the function adds one. In addition the function sets the destination register to null because it's no longer used. However i

[Mesa-dev] [PATCH] i965/fs: Set the header_size on LOAD_PAYLOAD in opt_sampler_eot

2015-05-07 Thread Neil Roberts
Commit 94ee908448 added a header size parameter to the function to create the LOAD_PAYLOAD instruction. However this broke opt_sampler_eot which manually constructs the instruction and so wasn't setting the header_size. This ends up making the parameters for the send message all have the wrong loca

[Mesa-dev] [PATCH] i965/fs: Disable opt_sampler_eot for textureGather

2015-05-08 Thread Neil Roberts
The opt_sampler_eot optimisation seems to break when the last instruction is SHADER_OPCODE_TG4. A bunch of Piglit tests end up doing this so it causes a lot of regressions. I can't find any documentation or known workarounds to indicate that this is expected behaviour, but considering that this is

[Mesa-dev] [PATCH 0/2] i965: Do conditional rendering in hardware

2015-05-08 Thread Neil Roberts
I thought it might be a good idea to try posting these patches again since it's been 6 months since they were originally posted. The patches are a lot more useful now since the command parser in the kernel is working correctly for Haswell. This means the functionality is no longer restricted to onl

[Mesa-dev] [PATCH 2/2 v3] i965: Use predicate enable bit for conditional rendering w/o stalling

2015-05-08 Thread Neil Roberts
EMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Aut

[Mesa-dev] [PATCH 1/2] i965: Store the command parser version number in intel_screen

2015-05-08 Thread Neil Roberts
In order to detect whether the predicate source registers can be used in a later patch we will need to know the version number for the command parser. This patch just adds a member to intel_screen and does an ioctl to get the version. Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/in

Re: [Mesa-dev] [PATCH 2/2 v3] i965: Use predicate enable bit for conditional rendering w/o stalling

2015-05-11 Thread Neil Roberts
Kenneth Graunke writes: > It might be nice to create a brw_load_register_mem64 function, for > symmetry with brw_store_register_mem64 - we might want to reuse it > elsewhere someday. Ok, that sounds sensible. > One interesting quirk: the two halves of your register write may land > in two separ

Re: [Mesa-dev] [PATCH 09/13] util/list: Add list_empty and list_length functions

2015-05-11 Thread Neil Roberts
Ian Romanick writes: >> For what it's worth, I'm strongly in favour of using these >> kernel-style lists instead of exec_list. The kernel ones seem much >> less confusing. > > Huh? They're practically identical. The only difference is the > kernel-style lists have a single sentinel node, and that
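
For readers unfamiliar with the distinction discussed in this thread, here is a minimal sketch of a kernel-style list with a single sentinel node, together with the two helpers named in the patch title. All `sketch_`-prefixed names are hypothetical and are not taken from Mesa's util/list.h; the real implementation may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names: a circular doubly linked list whose head doubles
 * as the single sentinel node. */
struct sketch_list {
    struct sketch_list *prev;
    struct sketch_list *next;
};

/* An empty list is a sentinel that points at itself in both directions. */
static void sketch_list_init(struct sketch_list *head)
{
    head->prev = head;
    head->next = head;
}

/* Link item in just before the sentinel, i.e. at the tail. */
static void sketch_list_add_tail(struct sketch_list *item,
                                 struct sketch_list *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* The list is empty when the sentinel's next pointer is the sentinel. */
static bool sketch_list_empty(const struct sketch_list *head)
{
    assert(head != NULL);
    return head->next == head;
}

/* Walk from the first real node until the sentinel is reached again. */
static size_t sketch_list_length(const struct sketch_list *head)
{
    size_t n = 0;
    for (const struct sketch_list *it = head->next; it != head; it = it->next)
        n++;
    return n;
}
```

With one sentinel, both the emptiness check and the end-of-iteration check compare against the same address, which is arguably part of why this layout reads as less confusing.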

[Mesa-dev] [PATCH] configure: Bump libdrm requirement for Intel to 2.4.61

2015-05-12 Thread Neil Roberts
This is required for the I915_PARAM_REVISION macro. Previously this define was directly copied into the Mesa source. --- configure.ac | 2 +- src/mesa/drivers/dri/i965/intel_screen.c | 5 - 2 files changed, 1 insertion(+), 6 deletions(-) diff --git a/configure.ac b

Re: [Mesa-dev] [RFC 00/10] Enable support for 2D ASTC HDR and LDR formats

2015-05-20 Thread Neil Roberts
Jason Ekstrand writes: > I think *most* of that code *should* already be there. In theory, > it's all keyed off of the block size provided by formats.csv. > However, given some of the rendering errors we're currently seeing, it > looks like it may need a little patching here and there. :-) inte

[Mesa-dev] [PATCH] i965/skl: Add a message header for the TXF_MCS instruction in vec4vs

2015-05-21 Thread Neil Roberts
When using SIMD4x2 on Skylake, the sampler instructions need a message header to select the correct mode. This was added for most sample instructions in 0ac4c2727 but the TXF_MCS instruction is emitted separately and it was missed. This fixes a bunch of Piglit tests which test texelFetch in a geom

Re: [Mesa-dev] [PATCH] i965: Disable compaction for EOT send messages

2015-05-28 Thread Neil Roberts
atches out there to handle this. Please ignore if > this has already been sent by someone. (Direct me to it and I will > review it). > > Cc: Matt Turner > Cc: Neil Roberts > Cc: Mark Janes > Signed-off-by: Ben Widawsky > --- > src/mesa/drivers/dri/i965/brw_eu_compact.c |

[Mesa-dev] [PATCH] i965/vec4: Fix the source register for indexed samplers

2015-05-28 Thread Neil Roberts
Previously when setting up the sample instruction for an indirect sampler the vec4 backend was directly passing the pseudo opcode's src0. However this isn't actually set to a valid register because instead the MRF registers are used as the source so it would end up passing null as src0. This patch

[Mesa-dev] [PATCH 1/2] i965: Don't use a temporary when generating an indirect sample

2015-05-29 Thread Neil Roberts
Previously when generating the send instruction for a sample instruction with an indirect sampler it would use the destination register as a temporary store. This breaks when used in combination with the opt_sampler_eot optimisation because that forces the destination to be null. This patch fixes t

[Mesa-dev] [PATCH 2/2] i965: Don't add base_binding_table_index if it's zero

2015-05-29 Thread Neil Roberts
When calculating the binding table index for non-constant sampler array indexing it needs to add the base binding table index which is a constant within the generated code. Often this base is zero so we can avoid a redundant instruction in that case. It looks like nothing in shader-db is doing non
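
The idea in this snippet can be sketched roughly as follows. This is not the i965 generator code: `sketch_emitter` and `sketch_surface_index` are hypothetical stand-ins, and the "emitted instruction" is modelled only as a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a code emitter; it only counts emitted ADDs. */
struct sketch_emitter {
    unsigned add_count;
};

/* Compute the binding table index for a dynamically indexed sampler.
 * The base is a compile-time constant, so when it is zero the ADD can
 * be skipped entirely and the sampler index is used as-is. */
static uint32_t sketch_surface_index(struct sketch_emitter *e,
                                     uint32_t sampler_index,
                                     uint32_t base_binding_table_index)
{
    assert(e != NULL);
    if (base_binding_table_index == 0)
        return sampler_index;           /* no instruction needed */

    e->add_count++;                     /* models emitting one ADD */
    return sampler_index + base_binding_table_index;
}
```

The check costs nothing at run time because it happens while the code is being generated, not in the generated code.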

Re: [Mesa-dev] [PATCH 1/2] i965: Don't use a temporary when generating an indirect sample

2015-06-01 Thread Neil Roberts
Many thanks for all the reviews and testing. I've pushed the two patches. The remaining sampler_array_indexing tests that fail on SKL (the gs ones) are because of a separate problem described in this patch: http://patchwork.freedesktop.org/patch/50676/ I'm not really sure whether that's the clea

Re: [Mesa-dev] [RFC v2 12/15] i965: correct VALIGN for 2d textures on Skylake

2015-06-01 Thread Neil Roberts
Looks good to me. Reviewed-by: Neil Roberts - Neil Anuj Phogat writes: > Adding Neil to Cc who committed 4ab8d59. > > Reviewed-by: Anuj Phogat

Re: [Mesa-dev] [RFC v2 12/15] i965: correct VALIGN for 2d textures on Skylake

2015-06-02 Thread Neil Roberts
querying the block height anyway. The later patch about combining the two functions would need to be changed too. Regards, - Neil Neil Roberts writes: > Looks good to me. > > Reviewed-by: Neil Roberts > > - Neil > > Anuj Phogat writes: > >> Adding Neil to Cc who

Re: [Mesa-dev] [PATCH] i965/fs: Use UW-typed immediate in multiply inst.

2015-06-03 Thread Neil Roberts
Looks good to me. Thanks for fixing this. I guess I still have more to learn about the ISA. However, should we not also fix the vec4 version? With that, Reviewed-by: Neil Roberts If we wanted to play safe and avoid the MUL, we could change it to this and still avoid having a temporary

Re: [Mesa-dev] [PATCH] i965/vec4: Fix the source register for indexed samplers

2015-06-04 Thread Neil Roberts
Matt Turner writes: > I don't know why I was confused by this patch -- after arriving at the > same conclusion independently I see that all of the analysis I needed > was right there. Yes sorry, I probably didn't explain it very well. Your explanation is a lot clearer. > To sum up, vec4_visitor

Re: [Mesa-dev] [PATCH 1/2] i965/fs: Don't let the EOT send message interfere with the MRF hack

2015-06-09 Thread Neil Roberts
Both patches look good to me and I can confirm they make the Piglit tests pass on Skylake. Reviewed-by: Neil Roberts My original assumption of the problem was that the implied writes from the SCRATCH_WRITE instruction aren't taken into account when calculating the liveness of the regi

Re: [Mesa-dev] [PATCH 1/2] i965/fs: Don't let the EOT send message interfere with the MRF hack

2015-06-09 Thread Neil Roberts
Jason Ekstrand writes: > The only place when the fact that the MRFs are virtual matters is in > register allocation. Implied MRF writes are taken into account in > setup_mrf_hack_interference. We figure out what MRFs are used and > then mark them as conflicting with *all* of the VGRFs. We also

[Mesa-dev] [PATCH 1/2] i965: Explicitly set base_mrf to -1 for pull constant loads on Gen7

2015-06-10 Thread Neil Roberts
A freshly constructed instruction defaults to having a base_mrf of 0 which means that if nothing disables it it will default to using send-from-MRF. Previously this didn't matter because the constant load instructions on Gen7 were ignoring the base_mrf anyway. However in the next patch the brw_send
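
The pitfall described here, a default of 0 being a valid MRF number, can be illustrated with a small sketch. The struct and helpers below are hypothetical; in particular, treating any non-negative `base_mrf` as "send from MRF" is an assumption made for illustration, not a statement about how the i965 backend tests it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced instruction record. */
struct sketch_inst {
    int base_mrf;   /* first MRF of the payload, or -1 for "no MRF" */
    int mlen;       /* message length in registers */
};

/* Assumption for this sketch: any non-negative base_mrf selects
 * send-from-MRF, so a zero-initialised instruction selects MRF 0. */
static bool sketch_sends_from_mrf(const struct sketch_inst *inst)
{
    assert(inst != NULL);
    return inst->base_mrf >= 0;
}

/* Build a pull constant load that must not use the MRF path: the
 * sentinel has to be written explicitly because the default is 0. */
static struct sketch_inst sketch_make_pull_constant_load(void)
{
    struct sketch_inst inst = { 0 };
    inst.base_mrf = -1;
    inst.mlen = 1;
    return inst;
}
```

Once the shared send helper starts honouring `base_mrf`, relying on the field being ignored is no longer safe, which is why the sentinel is set explicitly.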

[Mesa-dev] [PATCH 0/2] Fix sampler array indexing in vec4vs

2015-06-10 Thread Neil Roberts
Matt Turner writes: >> I'll have another look at moving it into brw_send_indirect_message. > > Thanks. I'm not really sure what the right solution is, so if you > decide this patch is good as is, that's fine with me. Here's what the patches would look like if we made brw_send_indirect_message lo
