On 10.03.2016 02:11, Marek Olšák wrote:
> On Wed, Mar 9, 2016 at 4:31 PM, Emil Velikov wrote:
>> On 8 March 2016 at 22:29, Marek Olšák wrote:
>>
>>> Actually, I don't see how the version number would make it any better
>>> for the structures, but returning the version number by
>>> QueryDeviceInf
Hi devs,
When building 64-bit Android-x86 Mesa with amdgpu support,
we got errors saying that the LLVMInitializeAMDGPU* symbols
are not defined (missing prototypes).
The errors are easy to fix by adding the
function prototypes.
However, I'm curious: on which side should this be fixed?
On Thu, Mar 10, 2016 at 4:43 AM, Michel Dänzer wrote:
> On 09.03.2016 20:29, Marek Olšák wrote:
>> On Wed, Mar 9, 2016 at 7:19 AM, Nicolai Hähnle wrote:
>>> On 02.03.2016 11:36, Marek Olšák wrote:
@@ -318,6 +343,13 @@ static boolean r600_texture_get_handle(struct
pipe_screen* scree
On Thu, Mar 10, 2016 at 9:34 AM, Michel Dänzer wrote:
> On 10.03.2016 02:11, Marek Olšák wrote:
>> On Wed, Mar 9, 2016 at 4:31 PM, Emil Velikov
>> wrote:
>>> On 8 March 2016 at 22:29, Marek Olšák wrote:
>>>
Actually, I don't see how the version number would make it any better
for the
On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
> Hello,
>
> There is only one patch from this series that has been reviewed (patch
> 1).
>
> Our plan is to start sending patches for adding fp64 support to i965
> driver in the coming weeks but they depend on these patc
Reviewed-by: Marek Olšák
Why do we still need the gallium codepath?
Marek
On Tue, Mar 8, 2016 at 1:21 PM, Christian König wrote:
> From: Christian König
>
> Avoid using internal structures from another API.
>
> Signed-off-by: Christian König
> ---
> src/mesa/state_tracker/st_vdpau.c | 176
Those functions are only supported by LLVM 3.7 and later, and if you
have such a version, the AMDGPU backend must be enabled in your LLVM
build.
Marek
On Thu, Mar 10, 2016 at 10:04 AM, Chih-Wei Huang
wrote:
> Hi devs,
> On building 64-bit Android-x86 mesa with amdgpu support,
> we got some error
Nouveau needs to be tested first, I have doubts that this will work out
of the box for them as well.
Christian.
On 10.03.2016 at 11:42, Marek Olšák wrote:
Reviewed-by: Marek Olšák
Why do we still need the gallium codepath?
Marek
On Tue, Mar 8, 2016 at 1:21 PM, Christian König wrote:
Fro
Alright.
Marek
On Thu, Mar 10, 2016 at 11:50 AM, Christian König
wrote:
> Nouveau needs to be tested first, I have doubts that this will work out of
> the box for them as well.
>
> Christian.
>
>
> On 10.03.2016 at 11:42, Marek Olšák wrote:
>>
>> Reviewed-by: Marek Olšák
>>
>> Why do we still
https://bugs.freedesktop.org/show_bug.cgi?id=94381
--- Comment #6 from Fluendo dev team ---
I've tried to test with mesa master. I've compiled it, but when I replace the
dri drivers (radeon_dri.so, radeonsi_dri.so, ...) my Xorg server crashes, and I
can't make it work until I put back the origina
Useful to know whether an expression is the recipient of an assignment
or not. That would be used, for example, to raise "use of
uninitialized variable" warnings without getting a false positive
when a variable is first assigned.
By default the value is false, and it is assigned to true on
the followin
https://bugs.freedesktop.org/show_bug.cgi?id=94381
--- Comment #7 from Christian König ---
(In reply to Fluendo dev team from comment #6)
> I've tried to test with mesa master. I've compiled it, but when I replace
> the dri drivers (radeon_dri.so, radeonsi_dri.so, ...) my Xorg server
> crashes, a
First, thank you all for your answers.
So if I summarize what was said, we need
Ian:
- add
- negate
- absolute value
- multiply
- reciprocal
- convert to single precision
- convert from single precision
Roland:
- sqrt
- comparison (< / == / >)
- floor/ceil
I will contact Pat Brown (His
On Thu, Mar 10, 2016 at 3:30 PM, tournier.elie wrote:
> First, thank you all for your answers.
>
> So if I summarize what was said, we need
> Ian:
> - add
> - negate
> - absolute value
> - multiply
> - reciprocal
> - convert to single precision
> - convert from single precision
> Roland:
>
On 10/03/16 10:27, Pohjolainen, Topi wrote:
> On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
>> Hello,
>>
>> There is only one patch from this series that has been reviewed (patch
>> 1).
>>
>> Our plan is to start sending patches for adding fp64 support to i965
>> dri
Extend the MEMORY file support to differentiate between global, local
and shared memory, as well as "input" memory.
"MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a
special memory type is added for this, since the actual storage of these
(e.g. UBO-s) may differ per implementati
When support for decl.Atomic and .Shared was added, tgsi_build_declaration
was not updated to propagate these properly.
Signed-off-by: Hans de Goede
---
src/gallium/auxiliary/tgsi/tgsi_build.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c
b/sr
Add support for clover / OpenCL kernel input parameters.
Signed-off-by: Hans de Goede
---
.../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 18 +++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
b/sr
Hi,
Here are patches which implement the support for OpenCL kernel input
parameters we discussed. They also add the tgsi parsing bits for
adding support for global / local mem, but no implementation yet.
Regards,
Hans
___
mesa-dev mailing list
mesa-dev
On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede wrote:
> Add support for clover / OpenCL kernel input parameters.
>
> Signed-off-by: Hans de Goede
> ---
> .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 18
> +++---
> 1 file changed, 15 insertions(+), 3 deletions(-)
>
> dif
Reviewed-by: Ilia Mirkin
On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede wrote:
> When support for decl.Atomic and .Shared was added, tgsi_build_declaration
> was not updated to propagate these properly.
>
> Signed-off-by: Hans de Goede
> ---
> src/gallium/auxiliary/tgsi/tgsi_build.c | 6 +
Reviewed-by: Samuel Pitoiset
On 03/10/2016 04:14 PM, Hans de Goede wrote:
When support for decl.Atomic and .Shared was added, tgsi_build_declaration
was not updated to propagate these properly.
Signed-off-by: Hans de Goede
---
src/gallium/auxiliary/tgsi/tgsi_build.c | 6 ++
1 file cha
On 03/10/2016 04:23 PM, Ilia Mirkin wrote:
On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede wrote:
Add support for clover / OpenCL kernel input parameters.
Signed-off-by: Hans de Goede
---
.../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 18 +++---
1 file changed, 15 i
On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede wrote:
> Extend the MEMORY file support to differentiate between global, local
> and shared memory, as well as "input" memory.
>
> "MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a
> special memory type is added for this, since the
On 03/09/2016 12:53 PM, Kyle Brenneman wrote:
On 03/09/2016 12:21 PM, Adam Jackson wrote:
On Wed, 2016-03-09 at 11:15 -0700, Kyle Brenneman wrote:
The current implementation of libglvnd uses a new X extension called
x11glvnd to look up a vendor name for each screen and to find a screen
number
On Thu, Mar 10, 2016 at 9:14 AM, Hans de Goede wrote:
> Extend the MEMORY file support to differentiate between global, local
> and shared memory, as well as "input" memory.
>
> "MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a
> special memory type is added for this, since the
On Thu, Mar 10, 2016 at 10:27 AM, Samuel Pitoiset
wrote:
>
>
> On 03/10/2016 04:23 PM, Ilia Mirkin wrote:
>>
>> On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede
>> wrote:
>>>
>>> Add support for clover / OpenCL kernel input parameters.
>>>
>>> Signed-off-by: Hans de Goede
>>> ---
>>> .../driver
On 03/10/2016 04:43 PM, Ilia Mirkin wrote:
On Thu, Mar 10, 2016 at 10:27 AM, Samuel Pitoiset
wrote:
On 03/10/2016 04:23 PM, Ilia Mirkin wrote:
On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede
wrote:
Add support for clover / OpenCL kernel input parameters.
Signed-off-by: Hans de Goede
I'm not super familiar with this code, but it looks good to me, so:
Reviewed-by: Nicolai Hähnle
On 20.02.2016 00:13, Ilia Mirkin wrote:
Signed-off-by: Ilia Mirkin
---
src/compiler/glsl/builtin_functions.cpp | 110 +++
src/compiler/glsl/glcpp/glcpp-parse.y|
On 20.02.2016 00:13, Ilia Mirkin wrote:
Signed-off-by: Ilia Mirkin
---
src/mesa/state_tracker/st_extensions.c | 4 +-
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 60 +++---
2 files changed, 57 insertions(+), 7 deletions(-)
diff --git a/src/mesa/state_tracker/st
On Thu, Mar 10, 2016 at 10:57 AM, Nicolai Hähnle wrote:
>> - if (c->MaxCombinedAtomicBuffers > 0)
>> + if (c->MaxCombinedAtomicBuffers > 0) {
>> extensions->ARB_shader_atomic_counters = GL_TRUE;
>> + extensions->ARB_shader_atomic_counter_ops = GL_TRUE;
>> + }
>
>
> I believe the
On 04:27 PM - Mar 10 2016, Samuel Pitoiset wrote:
>
>
> On 03/10/2016 04:23 PM, Ilia Mirkin wrote:
> >On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede wrote:
> >>Add support for clover / OpenCL kernel input parameters.
> >>
> >>Signed-off-by: Hans de Goede
> >>---
> >> .../drivers/nouveau/codeg
Looks fine, except that you will need to lower FILE_SHADER_INPUT to
FILE_MEMORY_SHARED for Tesla because input kernel parameters are located
at s[0x10]. No need to do this for Fermi+ because it's already lowered
to c0[]. Note that input kernel parameters will probably be stuck on
c7[] after m
On Thu, Mar 10, 2016 at 11:03 AM, Pierre Moreau wrote:
> You might want to increment the address by at least
> `info->prop.cp.inputOffset`, and if inputs still end up in shared on Tesla,
There's a cp.sharedOffset just for that :) However it doesn't appear
to get set anywhere...
On 03/10/2016 05:03 PM, Pierre Moreau wrote:
On 04:27 PM - Mar 10 2016, Samuel Pitoiset wrote:
On 03/10/2016 04:23 PM, Ilia Mirkin wrote:
On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede wrote:
Add support for clover / OpenCL kernel input parameters.
Signed-off-by: Hans de Goede
---
..
On Thu, Mar 10, 2016 at 11:05 AM, Samuel Pitoiset
wrote:
>> If I understand correctly, the goal is to have user inputs in a
>> `screen->uniform_bo`, and so for all generations?
>
> Sure for fermi, and probably for Tesla.
I think continuing to use the USER_PARAMS or whatever mechanism on
Tesla mak
On 11:05 AM - Mar 10 2016, Ilia Mirkin wrote:
> On Thu, Mar 10, 2016 at 11:03 AM, Pierre Moreau wrote:
> > You might want to increment the address by at least
> > `info->prop.cp.inputOffset`, and if inputs still end up in shared on Tesla,
>
> There's a cp.sharedOffset just for that :) However it
On Thu, 2016-03-10 at 08:32 -0700, Kyle Brenneman wrote:
> > That could work, although I would expect "vendor-specific info" to
> > mean "random, arbitrary, and probably not machine-parsable". I'd be
> > hesitant to try to impose a structure on something that's never had
> > any structure befor
On 09.03.2016 16:12, Bas Nieuwenhuizen wrote:
Clear DCC flags if necessary when binding a new sampler_view. Also
rebind all sampler views so that the sampler views that were already
bound are also up to date.
Seems mostly reasonable to me and should cover all the cases.
I don't think rebinding
ping?
I've already pushed patches 1 and 2, but the rest still require review.
On Sat, Feb 27, 2016 at 11:21 AM, Ilia Mirkin wrote:
> GL ES adds several extensions that enable the full functionality. I
> sent many of these out before on a piecemeal basis, but this unifies
> everything in one seri
On Thu, Mar 10, 2016 at 12:07 AM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> ---
> src/gallium/drivers/r600/r600_state_common.c | 30
>
> 1 file changed, 30 insertions(+)
>
> diff --git a/src/gallium/drivers/r600/r600_state_common.c
> b/src/gallium/drivers/r6
On Thu, Mar 10, 2016 at 5:36 PM, Marek Olšák wrote:
> On Thu, Mar 10, 2016 at 12:07 AM, Nicolai Hähnle wrote:
>> From: Nicolai Hähnle
>>
>> ---
>> src/gallium/drivers/r600/r600_state_common.c | 30
>>
>> 1 file changed, 30 insertions(+)
>>
>> diff --git a/src/galli
Clear DCC flags if necessary when binding a new sampler view.
v2: Do not reset DCC flags of bound sampler views.
Signed-off-by: Bas Nieuwenhuizen
---
src/gallium/drivers/radeon/r600_texture.c | 2 --
src/gallium/drivers/radeonsi/si_descriptors.c | 10 +++---
2 files changed, 7 insertio
From: Marek Olšák
Only used indirectly when checking dirty.st != 0
---
src/mesa/state_tracker/st_context.c | 2 --
src/mesa/state_tracker/st_context.h | 2 +-
src/mesa/state_tracker/st_draw.c| 4 ++--
3 files changed, 3 insertions(+), 5 deletions(-)
diff --git a/src/mesa/state_tracker/st_co
Do you also need to do this when validating the compute pipeline?
On Thu, Mar 10, 2016 at 11:59 AM, Marek Olšák wrote:
> From: Marek Olšák
>
> Only used indirectly when checking dirty.st != 0
> ---
> src/mesa/state_tracker/st_context.c | 2 --
> src/mesa/state_tracker/st_context.h | 2 +-
> src
On Thu, 10 Mar 2016 17:02:15 +0100, Ilia Mirkin wrote:
On Thu, Mar 10, 2016 at 10:57 AM, Nicolai Hähnle wrote:
- if (c->MaxCombinedAtomicBuffers > 0)
+ if (c->MaxCombinedAtomicBuffers > 0) {
extensions->ARB_shader_atomic_counters = GL_TRUE;
+ extensions->ARB_shader_atomic_cou
On 03/10/2016 01:23 AM, Ilia Mirkin wrote:
On Wed, Mar 9, 2016 at 6:23 PM, Samuel Pitoiset
wrote:
+ if (screen->base.class_3d <= NVF0_3D_CLASS &&
+ screen->base.class_3d != NVEA_3D_CLASS) {
Why? NVEA should be the same as NVF0 I think... and actually
NVEA_3D_CLASS is 0xa
On 03/10/2016 01:28 AM, Ilia Mirkin wrote:
On Wed, Mar 9, 2016 at 6:23 PM, Samuel Pitoiset
wrote:
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc
The patch makes a bit more sense to me after realizing a fallthrough was
changed to a break, so the whole patch is
Reviewed-by: Glenn Kennard
Yes, please see the attached updated patch.
Thanks,
Marek
On Thu, Mar 10, 2016 at 6:00 PM, Ilia Mirkin wrote:
> Do you also need to do this when validating the compute pipeline?
>
> On Thu, Mar 10, 2016 at 11:59 AM, Marek Olšák wrote:
>> From: Marek Olšák
>>
>> Only used indirectly when checki
On Thu, Mar 10, 2016 at 12:04 PM, Glenn Kennard wrote:
> On Thu, 10 Mar 2016 17:02:15 +0100, Ilia Mirkin
> wrote:
>
>> On Thu, Mar 10, 2016 at 10:57 AM, Nicolai Hähnle
>> wrote:
- if (c->MaxCombinedAtomicBuffers > 0)
+ if (c->MaxCombinedAtomicBuffers > 0) {
extens
v2 is Reviewed-by: Ilia Mirkin
[in the future, I'd really appreciate inline patches... had to
"manually" de-base64 the attachment... gmail, in their infinite
wisdom, doesn't provide a way to view inline attachments]
On Thu, Mar 10, 2016 at 12:09 PM, Marek Olšák wrote:
> Yes, please see the atta
From: Marek Olšák
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 35 ++
1 file changed, 35 insertions(+)
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 26e463e..27c8a47 100644
--- a/src/mesa/state_tracker
From: Marek Olšák
Radeonsi needs to know which shader stage will execute after a shader
in order to make the best decision about which shader variant to compile
first.
This is only set for VS and TES, because we don't need it elsewhere.
VS has 3 variants:
- next shader is FS
- next shader is GS
From: Marek Olšák
This allows compiling the main shader part as ES or LS.
If we get the correct hint, non-separable GLSL shaders no longer have to be
compiled as VS first, followed by LS or ES compiled on demand.
The result is that fewer shaders are compiled by piglit, but it doesn't
improve pi
On 09/03/16 20:15, Kyle Brenneman wrote:
The current implementation of libglvnd uses a new X extension called
x11glvnd to look up a vendor name for each screen and to find a screen
number for a GLXDrawable.
But, Adam Jackson pointed out that a GLX extension could do the same job
more cleanly: Lo
On Thu, 10 Mar 2016 18:13:03 +0100, Ilia Mirkin wrote:
On Thu, Mar 10, 2016 at 12:04 PM, Glenn Kennard wrote:
On Thu, 10 Mar 2016 17:02:15 +0100, Ilia Mirkin
wrote:
On Thu, Mar 10, 2016 at 10:57 AM, Nicolai Hähnle
wrote:
- if (c->MaxCombinedAtomicBuffers > 0)
+ if (c->MaxCombinedAto
On 03/10/2016 10:47 AM, Martin Peres wrote:
On 09/03/16 20:15, Kyle Brenneman wrote:
The current implementation of libglvnd uses a new X extension called
x11glvnd to look up a vendor name for each screen and to find a screen
number for a GLXDrawable.
But, Adam Jackson pointed out that a GLX ex
Hi,
On 10-03-16 17:03, Samuel Pitoiset wrote:
Looks fine, except that you will need to lower FILE_SHADER_INPUT to
FILE_MEMORY_SHARED for Tesla because input kernel parameters are located at
s[0x10].
Ok, but should this be done in nv50_ir_from_tgsi.cpp? That feels like the
wrong place to
ha
https://bugs.freedesktop.org/show_bug.cgi?id=94481
Bug ID: 94481
Summary: softpipe - access violation in img_filter_2d_nearest
Product: Mesa
Version: 11.2
Hardware: All
OS: Windows (All)
Status: NEW
Severi
https://bugs.freedesktop.org/show_bug.cgi?id=94481
Greg changed:
What     | Removed | Added
Hardware | All     | x86-64 (AMD64)
--
You are receiving this mail be
https://bugs.freedesktop.org/show_bug.cgi?id=94481
Greg changed:
What | Removed | Added
CC   |         | greg.bea...@gmail.com
On Thu, 2016-03-10 at 10:53 -0700, Kyle Brenneman wrote:
> On 03/10/2016 10:47 AM, Martin Peres wrote:
> >
> > That could be a hacky way of handling the case where multiple 3D
> > drivers could be used to drive the same GPU. This may be necessary in
> > the future if two mesa drivers support the
Uniform linking (see link_assign_uniform_locations()) already
stores the index to the storage in ir_variable which is further
stored into nir_variable (see nir_visitor::visit(ir_variable *)).
Instead of doing uniform_num^2 string comparisons one can recur
over the uniform type the same way unif
On 10.03.2016 at 08:47, Andreas Fänger wrote:
>
>> -Original Message- From: Roland Scheidegger Sent:
>> Wednesday, 9 March 2016 17:31 Subject: Re: [Mesa-dev] [PATCH] scons:
>> build osmesa swrast and gallium
>>
>> On 09.03.2016 at 08:41, Andreas Fänger wrote:
-Ursprüng
From: Ian Romanick
Remove the parameter. Also, reformat the function definition to match
Mesa coding style.
brw_state_dump.c: In function ‘q_to_float’:
brw_state_dump.c:266:44: warning: unused parameter ‘integer_end’
[-Wunused-parameter]
static float q_to_float(uint32_t data, int integer_end,
From: Ian Romanick
brw_state_dump.c: In function ‘gen7_dump_sampler_state’:
brw_state_dump.c:405:22: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
for (int i = 0; i < size / 16; i++) {
^
brw_state_dump.c: In function ‘gen8_dump_ble
From: Ian Romanick
This also prevented some regressions with other patches in my local
tree.
Broadwell / Skylake
total instructions in shared programs: 8980835 -> 8980833 (-0.00%)
instructions in affected programs: 45 -> 43 (-4.44%)
helped: 1
HURT: 0
total cycles in shared programs: 70077904 ->
From: Ian Romanick
In the results below, 2 SIMD16 shaders in Trine are lost.
G4X
total instructions in shared programs: 4012279 -> 4011108 (-0.03%)
instructions in affected programs: 116776 -> 115605 (-1.00%)
helped: 339
HURT: 0
total cycles in shared programs: 84315862 -> 84313584 (-0.00%)
cyc
From: Ian Romanick
Sandy Bridge / Ivy Bridge / Haswell
total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
instructions in affected programs: 564 -> 558 (-1.06%)
helped: 6
HURT: 0
total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
cycles in affected programs: 9768 ->
From: Ian Romanick
This enables removing ssa_201 and ssa_202 in sequences like:
vec1 ssa_200 = flt ssa_199, ssa_194
vec1 ssa_201 = b2i ssa_200
vec1 ssa_202 = i2b -ssa_201
shader-db results:
Sandy Bridge
total instructions in shared programs: 8
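Why ssa_201 and ssa_202 are removable in the sequence above can be sketched in plain Python. This is only an illustration: it assumes booleans are 32-bit values (0 for false, all-ones for true), and the helper names b2i, ineg, i2b and round_trip are hypothetical stand-ins, not NIR code.

```python
# Assumption: booleans are modelled as 32-bit values, 0 for false and
# all-ones for true. The helpers are hypothetical stand-ins, not NIR code.
MASK32 = 0xFFFFFFFF
FALSE = 0
TRUE = MASK32


def b2i(b):
    """Boolean to integer: true becomes 1, false becomes 0."""
    return 1 if b != 0 else 0


def ineg(i):
    """32-bit two's-complement negation."""
    return (-i) & MASK32


def i2b(i):
    """Integer to boolean: any non-zero value is true."""
    return TRUE if i != 0 else FALSE


def round_trip(b):
    """Models ssa_202 = i2b(-b2i(ssa_200))."""
    return i2b(ineg(b2i(b)))


# The round trip returns the original boolean for both possible inputs,
# so users of ssa_202 could read ssa_200 directly.
for b in (FALSE, TRUE):
    assert round_trip(b) == b
```

Negating 0 or 1 never turns a zero into a non-zero value or the other way round, which is all that i2b looks at.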
From: Ian Romanick
On Intel platforms that don't set lower_flrp, using bcsel instead of
flrp seems to be a small amount worse. On those platforms, the use of
flrp, bcsel, and multiply of b2f is still an active area of research.
shader-db results:
G4X / Ironlake
total instructions in shared pro
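The relationship between flrp and bcsel that this message relies on can be sketched as follows. It assumes flrp(x, y, t) is the usual linear interpolation x * (1 - t) + y * t and that the interpolation factor comes from b2f of a boolean; the Python helpers are hypothetical models, not the NIR opcodes themselves.

```python
# Assumption: flrp(x, y, t) is the usual linear interpolation
# x * (1 - t) + y * t, and b2f maps a boolean to 0.0 or 1.0.
def flrp(x, y, t):
    return x * (1.0 - t) + y * t


def b2f(b):
    return 1.0 if b else 0.0


def bcsel(cond, if_true, if_false):
    return if_true if cond else if_false


# When the factor is exactly 0.0 or 1.0, interpolation degenerates
# into a plain select between the two operands (for finite inputs).
for cond in (False, True):
    for x, y in ((2.0, 5.0), (-1.5, 0.25)):
        assert flrp(x, y, b2f(cond)) == bcsel(cond, y, x)
```

The equivalence only holds for finite operands (an infinity multiplied by 0.0 gives NaN), and which form is cheaper depends on the hardware, as the message itself notes.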
From: Ian Romanick
I don't understand why the old code was bad, but the new code is fine.
brw_state_dump.c: In function ‘brw_debug_batch’:
brw_state_dump.c:677:4: warning: cannot optimize loop, the loop counter may
overflow [-Wunsafe-loop-optimizations]
for (i = 0; i < size / 4; i += 4) {
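One plausible reading of the warning, sketched in Python: if i is a 32-bit unsigned counter (an assumption, the declaration is not shown here), then stepping by 4 can wrap past a large bound without ever reaching it, so the compiler cannot prove how many times the loop runs. The limit value below is hypothetical and chosen only to show the wrap-around.

```python
# Assumption: a 32-bit unsigned counter; the large limit is a hypothetical
# bound chosen to show the wrap-around, not a value from brw_state_dump.c.
MASK32 = 0xFFFFFFFF


def trace_loop(limit, step, max_iters):
    """Counter values visited by: for (i = 0; i < limit; i += step)."""
    seen = []
    i = 0
    while i < limit and len(seen) < max_iters:
        seen.append(i)
        i = (i + step) & MASK32  # unsigned arithmetic wraps modulo 2**32
    return seen


# Stepping by 4 from the last multiple of 4 below the limit wraps to 0,
# so the condition i < limit would never become false.
last = 0xFFFFFFFC
assert last < 0xFFFFFFFE
assert (last + 4) & MASK32 == 0

# With a small limit the loop runs the expected number of times.
assert trace_loop(16, 4, 100) == [0, 4, 8, 12]
```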
From: Ian Romanick
No shader-db changes, but this is symmetric with the previous commit.
Signed-off-by: Ian Romanick
---
src/compiler/nir/nir_opt_algebraic.py | 4
1 file changed, 4 insertions(+)
diff --git a/src/compiler/nir/nir_opt_algebraic.py
b/src/compiler/nir/nir_opt_algebraic.py
From: Ian Romanick
Previously we were doing the lowering by hand in vec4_visitor::emit_lrp.
By doing it in NIR, we have the opportunity for NIR to do additional
optimization of the expanded code.
This also enables optimizations added by the next commit.
shader-db results:
G4X / Ironlake
total
This is the first round of patches to improve the way we deal with
floating point values used to represent Booleans. I have a bunch more,
but this is the set that has been stable and shows almost exclusively
improvement.
I'm not married to the first 3 patches in the series. I was trying to
silen
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick wrote:
> From: Ian Romanick
>
> Previously we were doing the lowering by hand in vec4_visitor::emit_lrp.
> By doing it in NIR, we have the opportunity for NIR to do additional
> optimization of the expanded code.
>
> This also enables optimizations a
Hi,
On 10-03-16 16:35, Aaron Watry wrote:
On Thu, Mar 10, 2016 at 9:14 AM, Hans de Goede wrote:
Extend the MEMORY file support to differentiate between global, local
and shared memory, as well as "input" memory.
"MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a
special mem
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick wrote:
> From: Ian Romanick
>
> This enables removing ssa_201 and ssa_202 in sequences like:
>
> vec1 ssa_200 = flt ssa_199, ssa_194
> vec1 ssa_201 = b2i ssa_200
> vec1 ssa_202 = i2b -ssa_201
Review
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick wrote:
> From: Ian Romanick
>
> This also prevented some regressions with other patches in my local
> tree.
>
> Broadwell / Skylake
> total instructions in shared programs: 8980835 -> 8980833 (-0.00%)
> instructions in affected programs: 45 -> 43 (-4
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick wrote:
> From: Ian Romanick
>
> Sandy Bridge / Ivy Bridge / Haswell
> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
> instructions in affected programs: 564 -> 558 (-1.06%)
> helped: 6
> HURT: 0
>
> total cycles in shared program
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick wrote:
> From: Ian Romanick
>
> On Intel platforms that don't set lower_flrp, using bcsel instead of
> flrp seems to be a small amount worse.
Yep, that's my experience too. It's because bcsel turns into CMP+SEL,
and because of the flag register we c
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick wrote:
> From: Ian Romanick
Reviewed-by: Matt Turner
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick wrote:
> From: Ian Romanick
>
> No shader-db changes, but this is symmetric with the previous commit.
Right, i965 doesn't use these operations.
Reviewed-by: Matt Turner
On Thu, Mar 10, 2016 at 10:13 AM, Topi Pohjolainen
wrote:
> Uniform linking in (see link_assign_uniform_locations()) already
> stores the index to the storage in ir_variable which is further
> stored into nir_variable (see nir_visitor::visit(ir_variable *)).
>
> Instead of doing uniform_num^2 stri
https://bugs.freedesktop.org/show_bug.cgi?id=94481
--- Comment #1 from Greg ---
This problem also exists in mip_filter_linear_aniso(...)
On 03/10/2016 06:30 AM, tournier.elie wrote:
> First, thank you all for your answers.
>
> So if I summarize what was said, we need
> Ian:
> - add
> - negate
> - absolute value
> - multiply
> - reciprocal
> - convert to single precision
> - convert from single precision
> Roland:
> - sqrt
On Wednesday, March 9, 2016 3:18:50 PM PST Jon Turney wrote:
> On 05/03/2016 03:33, Kenneth Graunke wrote:
> > We resolved the implicit version directive when processing control lines,
> > such as #ifdef, to ensure any built-in macros exist. However, we failed
> > to resolve it when handling ordin
On Thu, Mar 10, 2016 at 1:25 PM, Ian Romanick wrote:
> From: Ian Romanick
>
> Sandy Bridge / Ivy Bridge / Haswell
> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
> instructions in affected programs: 564 -> 558 (-1.06%)
> helped: 6
> HURT: 0
>
> total cycles in shared programs
On Thu, Mar 10, 2016 at 3:24 PM, Ilia Mirkin wrote:
> On Thu, Mar 10, 2016 at 1:25 PM, Ian Romanick wrote:
>> From: Ian Romanick
>>
>> Sandy Bridge / Ivy Bridge / Haswell
>> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
>> instructions in affected programs: 564 -> 558 (-1.06
- There's no reason there would be only 64 operations that read from the
output of a mov from VPM, so we might smash the stack (fixes etqw trace)
- Fixes segfault where we assumed that a single-use temp had a def (fixes
2 piglit tests)
- We need to only mark progress when we actually did the
On Thu, Mar 10, 2016 at 12:25 PM, Ian Romanick wrote:
> From: Ian Romanick
>
> Sandy Bridge / Ivy Bridge / Haswell
> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
> instructions in affected programs: 564 -> 558 (-1.06%)
> helped: 6
> HURT: 0
>
> total cycles in shared program
Ian Romanick writes:
> From: Ian Romanick
>
> I don't understand why the old code was bad, but the new code is fine.
Probably because the *loop counter* can no longer overflow. Thus the
loop can be optimized. The fact that "i" might overflow has become
irrelevant to the warning.
(And from that
On Thu, Mar 10, 2016 at 3:08 PM, Patrick Baggett
wrote:
> On Thu, Mar 10, 2016 at 12:25 PM, Ian Romanick wrote:
>> From: Ian Romanick
>>
>> Sandy Bridge / Ivy Bridge / Haswell
>> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
>> instructions in affected programs: 564 -> 558 (
Quoting Marek Olšák (2016-03-10 06:57:57)
> On Thu, Mar 10, 2016 at 3:30 PM, tournier.elie
> wrote:
> > First, thank you all for your answers.
> >
> > So if I summarize what was said, we need
> > Ian:
> > - add
> > - negate
> > - absolute value
> > - multiply
> > - reciprocal
> > - convert
From: Nicolai Hähnle
The enums MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS and
MAX_COMBINED_SHADER_OUTPUT_RESOURCES are equal and should therefore only
appear once.
Noticed while implementing ARB_shader_image_load_store without previously
implementing SSBO.
---
src/mesa/main/get.c
From: Nicolai Hähnle
---
src/mesa/state_tracker/st_atom_image.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mesa/state_tracker/st_atom_image.c
b/src/mesa/state_tracker/st_atom_image.c
index d0f0c42..bf7486b 100644
--- a/src/mesa/state_tracker/st_atom_image.c
+++ b/
On 10.03.2016 12:50, Glenn Kennard wrote:
On Thu, 10 Mar 2016 18:13:03 +0100, Ilia Mirkin
wrote:
On Thu, Mar 10, 2016 at 12:04 PM, Glenn Kennard
wrote:
On Thu, 10 Mar 2016 17:02:15 +0100, Ilia Mirkin
wrote:
On Thu, Mar 10, 2016 at 10:57 AM, Nicolai Hähnle
wrote:
- if (c->MaxCombinedA
On Thu, Mar 10, 2016 at 9:30 AM, tournier.elie wrote:
> First, thank you all for your answers.
>
> So if I summarize what was said, we need
> Ian:
> - add
> - negate
> - absolute value
> - multiply
> - reciprocal
> - convert to single precision
> - convert from single precision
> Roland:
>
On Mon, Mar 7, 2016 at 3:45 AM, Samuel Iglesias Gonsálvez
wrote:
> From: Jason Ekstrand
>
> v2: Fix size/type mask to properly handle 8-bit types.
>
> Signed-off-by: Juan A. Suarez Romero
> ---
> src/compiler/nir/nir.h | 17 -
> 1 file changed, 16 insertions(+), 1 deletion(-)
>