Saves a tiny bit of CPU overhead.
Signed-off-by: Kenneth Graunke
---
src/mesa/drivers/dri/i965/brw_vs_surface_state.c | 10 +-
src/mesa/drivers/dri/i965/gen6_vs_state.c | 12 ++--
2 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/br
On 14 November 2014 01:48, Emil Velikov wrote:
> Hi Steven,
>
> On 14 November 2014 01:40, Steven Stewart-Gallus
> wrote:
>> But my distribution does build Mesa with debugging symbols. I have the
>> package
>> libgl1-mesa-dri-dbg installed which gives me debugging symbols such as
>> drm_intel_bo
On Thu, Nov 13, 2014 at 4:28 PM, Kristian Høgsberg wrote:
> @@ -3148,6 +3150,9 @@ fs_visitor::dump_instruction(backend_instruction
> *be_inst, FILE *file)
> case UNIFORM:
>fprintf(file, "***u%d***", inst->dst.reg + inst->dst.reg_offset);
>break;
> + case ATTR:
> + fprin
On Thu, Nov 13, 2014 at 4:28 PM, Kristian Høgsberg wrote:
> The LOAD_PAYLOAD opcode can't saturate its sources, so skip
> saturating MOVs. The register coalescing after lower_load_payload()
> will clean up the extra MOVs.
>
> Signed-off-by: Kristian Høgsberg
> ---
> src/mesa/drivers/dri/i965/br
On 14.11.2014 08:43, Aaron Watry wrote:
> Walk the array of cbufs backwards and free all of them.
>
> v3: Rebase on top of changes since Aug 2014
>
> Signed-off-by: Aaron Watry
> ---
> src/gallium/drivers/r600/evergreen_compute.c | 9 +
> 1 file changed, 9 insertions(+)
>
> diff --gi
On Thu, Nov 13, 2014 at 7:28 PM, Kristian Høgsberg wrote:
> This will be reused for the scalar VS pass.
>
> Signed-off-by: Kristian Høgsberg
> ---
> src/mesa/drivers/dri/i965/brw_fs.cpp | 132
> +++
> src/mesa/drivers/dri/i965/brw_fs.h | 1 +
> 2 files change
On Thu, Nov 13, 2014 at 7:04 PM, Ilia Mirkin wrote:
> On Thu, Nov 13, 2014 at 7:54 PM, Aaron Watry wrote:
>>
>> On Thu, Nov 13, 2014 at 6:22 PM, Ilia Mirkin wrote:
>> > On Thu, Nov 13, 2014 at 6:43 PM, Aaron Watry wrote:
>> >> dlopen allocates a string on dlopen failure which is retrieved via
Hi Steven,
On 14 November 2014 01:40, Steven Stewart-Gallus
wrote:
> But my distribution does build Mesa with debugging symbols. I have the package
> libgl1-mesa-dri-dbg installed which gives me debugging symbols such as
> drm_intel_bo_wait_rendering and drm_intel_bo_subdata. I assume I don't hav
But my distribution does build Mesa with debugging symbols. I have the package
libgl1-mesa-dri-dbg installed which gives me debugging symbols such as
drm_intel_bo_wait_rendering and drm_intel_bo_subdata. I assume what I don't have is
debugging information for JITted code although maybe the problem is a
On Thu, Nov 13, 2014 at 7:54 PM, Aaron Watry wrote:
>
> On Thu, Nov 13, 2014 at 6:22 PM, Ilia Mirkin wrote:
> > On Thu, Nov 13, 2014 at 6:43 PM, Aaron Watry wrote:
> >> dlopen allocates a string on dlopen failure which is retrieved via
> >> dlerror. In
> >> order to free that string, you need t
On Thu, Nov 13, 2014 at 6:22 PM, Ilia Mirkin wrote:
> On Thu, Nov 13, 2014 at 6:43 PM, Aaron Watry wrote:
>> dlopen allocates a string on dlopen failure which is retrieved via dlerror.
>> In
>> order to free that string, you need to retrieve and then free it.
>
> Are you basically saying that gl
With the scalar vertex shader coming up, we're going to reuse brw_vec4_prog_data
in the scalar backend. Nothing in the struct is vec4-specific; it is
common state for stages that operate on VUEs. This patch renames
the struct to brw_vue_prog_data, which is more descriptive and will look a
Now that the caller passes in the shader debug name, we don't need this
anymore.
Signed-off-by: Kristian Høgsberg
---
src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp | 2 +-
src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
src/mesa/drivers/dri/i965/brw_fs.h | 2 --
src/mesa/d
We split out SIMD8 and SIMD16 generation into separate calls to
a new method, generate_code(), which returns the start offset for the
generated code. A new get_assembly() method returns the generated code.
This avoids asserting MESA_SHADER_FRAGMENT and accessing wm_prog_data
in the generator.
Signe
The LOAD_PAYLOAD opcode can't saturate its sources, so skip
saturating MOVs. The register coalescing after lower_load_payload()
will clean up the extra MOVs.
Signed-off-by: Kristian Høgsberg
---
src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 6 +-
1 file changed, 5 insertions(+), 1
These last few operations all only apply when we've actually generated code,
optimized and allocated registers. The dummy and the repclear shaders don't
touch uncompressed_stack, don't need the gen4 send workaround, and don't
spill. This means we can move these lines into the else-branch, which w
This flag signals that we have a SIMD8 VS shader so we can set up the
corresponding state accordingly. This boils down to setting
the BDW+ SIMD8 enable bit in 3DSTATE_VS and making UBO and pull
constant buffers use dword pitch.
Signed-off-by: Kristian Høgsberg
---
src/mesa/drivers/dri/i965/brw_
With everything in place, we can now use the scalar backend compiler for
vertex shaders on BDW+. We make scalar vertex shaders the default on
BDW+ but add a new vec4vs debug option to force the vec4 backend.
No piglit regressions.
Performance impact is minimal, I see a ~1.5 improvement on the T-
This patch uses the previous refactoring to add a new run_vs() method
that generates vertex shader code using the scalar visitor and
optimizer.
Signed-off-by: Kristian Høgsberg
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 99 -
src/mesa/drivers/dri/i965/brw_fs.h | 21 +-
Hi,
Here's v2 of the patch series. It incorporates Matt's review comments and
adds a new patch to refactor the way we call fs_generator. The idea is
to get rid of the MESA_SHADER_FS assertion in generate_assembly() in a
nicer way. Now we call generate_code() two times with different dispatch
wit
We'll reuse this toplevel optimization driver for the scalar VS.
Signed-off-by: Kristian Høgsberg
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 136 ++-
src/mesa/drivers/dri/i965/brw_fs.h | 1 +
2 files changed, 72 insertions(+), 65 deletions(-)
diff --git a/src
This will be reused for the scalar VS pass.
Signed-off-by: Kristian Høgsberg
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 132 +++
src/mesa/drivers/dri/i965/brw_fs.h | 1 +
2 files changed, 71 insertions(+), 62 deletions(-)
diff --git a/src/mesa/drivers/dri/i96
fs_generator no longer knows what stage it's generating code for, so
we have to set the debug name of the shader from the call site.
Signed-off-by: Kristian Høgsberg
---
src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp | 4 +++-
src/mesa/drivers/dri/i965/brw_fs.cpp | 17 --
Now that fs_visitor::run is back to being only fragment
shader compilation, we can clean up a few stage == MESA_SHADER_FRAGMENT
conditions and rename it to run_fs.
Signed-off-by: Kristian Høgsberg
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 31 +--
src/mesa/drivers/dri
The scalar vertex shader will use the ATTR register file for vertex
attributes. This patch adds support for the ATTR file to fs_visitor.
Signed-off-by: Kristian Høgsberg
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 12 ++--
src/mesa/drivers/dri/i965/brw_fs.h | 3 +++
This removes all stage specific data from the generator, and lets us
create a generator for any stage.
Signed-off-by: Kristian Høgsberg
---
src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp | 5 ++---
src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
src/mesa/drivers/dri/i965/brw_fs.h
This chunk of code is repeated in a few places, and we're going to add
a MESA_SHADER_VERTEX case to it soon.
Signed-off-by: Kristian Høgsberg
---
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 37
1 file changed, 16 insertions(+), 21 deletions(-)
diff --git a/src/me
This is all we need from the generator for SIMD8 vertex shaders. This
opcode is just the send instruction; all the hard work will happen
in the visitor using LOAD_PAYLOAD.
Signed-off-by: Kristian Høgsberg
---
src/mesa/drivers/dri/i965/brw_defines.h | 3 +++
src/mesa/drivers/dri/i965/
https://bugs.freedesktop.org/show_bug.cgi?id=86070
--- Comment #8 from Nicholas Yue ---
(In reply to Sinclair Yeh from comment #7)
> This issue has been fixed by ce9a3a8997d86f3bf387f23578972acb5b16ac4ac,
> which is in MESA 10.1.0 onwards. The fix is not trivial to back port to
> MESA 8.0.4 and
On Thu, Nov 13, 2014 at 6:43 PM, Aaron Watry wrote:
> dlopen allocates a string on dlopen failure which is retrieved via dlerror. In
> order to free that string, you need to retrieve and then free it.
Are you basically saying that glibc leaks memory and you're trying to
make up for it? What if yo
v3: Rebase and add #if guards
v2: fix indentation
Signed-off-by: Aaron Watry
---
src/gallium/drivers/r600/evergreen_compute.c | 19 +++
1 file changed, 19 insertions(+)
diff --git a/src/gallium/drivers/r600/evergreen_compute.c
b/src/gallium/drivers/r600/evergreen_compute.c
inde
Walk the array of cbufs backwards and free all of them.
v3: Rebase on top of changes since Aug 2014
Signed-off-by: Aaron Watry
---
src/gallium/drivers/r600/evergreen_compute.c | 9 +
1 file changed, 9 insertions(+)
diff --git a/src/gallium/drivers/r600/evergreen_compute.c
b/src/galliu
shader->code_bo was leaked VRAM
shader->bc.bytecode, shader->binary.* were leaked system memory.
Signed-off-by: Aaron Watry
---
src/gallium/drivers/r600/evergreen_compute.c | 7 +++
1 file changed, 7 insertions(+)
diff --git a/src/gallium/drivers/r600/evergreen_compute.c
b/src/gallium/driv
dlopen allocates a string on dlopen failure which is retrieved via dlerror. In
order to free that string, you need to retrieve and then free it.
To keep things consistent, the Windows and other util_dl_error paths also
allocate a buffer and copy their error message into it.
Signed-off-by: Aar
On Wed, Nov 12, 2014 at 11:15 AM, Matt Turner wrote:
>
> ---
> src/mesa/drivers/dri/i965/brw_fs.cpp | 6 --
> src/mesa/drivers/dri/i965/brw_fs.h | 1 -
> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 2 +-
> 3 files changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/s
On Wed, 2014-11-12 at 21:47 +0200, Juha-Pekka Heikkila wrote:
> On 12.11.2014 19:36, Bruno Jimenez wrote:
> > On Wed, 2014-11-12 at 14:50 +0200, Juha-Pekka Heikkila wrote:
> >> Signed-off-by: Juha-Pekka Heikkila
> >> ---
> >> src/mesa/Makefile.am | 8 +++
> >> src/mesa/main/sse2_clampi
FWIW opencl explicit conversion instructions have optional rounding mode
modifiers.
Roland
Am 13.11.2014 um 21:19 schrieb Jose Fonseca:
> I've eliminated our internal dependency on TGSI_OPCODE_CND (by replacing
> SUB+CMP). So you can commit the change to remove it as far as I'm concerned.
>
>
https://bugs.freedesktop.org/show_bug.cgi?id=86070
Sinclair Yeh changed:
What       |Removed  |Added
Status     |ASSIGNED |RESOLVED
Resolution |---
I've eliminated our internal dependency on TGSI_OPCODE_CND (by replacing
SUB+CMP). So you can commit the change to remove it as far as I'm concerned.
I have mixed feelings about ARR, because the operation it does is essentially
an "iround()", i.e., "(int)roundf()", and at least when targeting
Initially TGSI used to be an union of all possible opcodes (NV/ARB fp/vp, Mesa
IR, D3D Shader Model 1.x, 2.x, more recently D3D10).
But in practice it's just too much of a hassle, and many of the opcodes were
never handled or generated. Many received little to no testing.
Particularly when impl
On Thu, Nov 13, 2014 at 09:47:10AM -0800, Eric Anholt wrote:
> Kenneth Graunke writes:
>
> > On Wednesday, November 12, 2014 06:54:31 PM Ben Widawsky wrote:
> >> Every other unit in the geometry pipeline automatically enables
> >> statistics gathering. This part of the pipe has been controlled by
On Thursday, November 13, 2014 03:09:22 PM Francisco Jerez wrote:
> Kenneth Graunke writes:
>
> > On Wednesday, November 12, 2014 09:57:30 PM Matt Turner wrote:
> >> On Wed, Nov 12, 2014 at 9:35 PM, Kenneth Graunke
> > wrote:
> >> > +vec4_visitor::emit_math(enum opcode opcode,
> >> > +
It looks like ARR is mildly useful though, as hw often can implement it
natively and it benefits at least one state tracker (not that an
optimizing backend couldn't recognize round+arl, but llvmpipe wouldn't, at
least right now).
So, maybe it would be better to keep it for now.
Roland
Am 13.11.2014
Nothing in the tree generated it.
---
The rest of the rebase to deal with the conflicts with this can be found at
tgsi-opcode-nuke-2 of my Mesa tree. (CND is also left in there)
src/gallium/auxiliary/gallivm/lp_bld_tgsi.c | 1 -
src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c | 6 --
sr
From: Thierry Reding
The DRM_IOCTL_MODE_CREATE_DUMB (and others) IOCTL isn't very rigorously
specified, which has the effect that some kernel drivers do not consider
the .pitch and .size fields of struct drm_mode_create_dumb outputs only.
Instead they will use these as lower bounds and overwrite
Kenneth Graunke writes:
> On Wednesday, November 12, 2014 06:54:31 PM Ben Widawsky wrote:
>> Every other unit in the geometry pipeline automatically enables
>> statistics gathering. This part of the pipe has been controlled by the
>> DEBUG_STATS variable, but this is asymmetric. This dates back t
On Wed, Aug 06, 2014 at 09:56:30PM +0300, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä
>
> I had a few rainy days during my summer vacation so I decided to fix a
> chromium-bsu texturing problem that was nagging me for a while now. I
> ended up fixing a few other things too that I
As long as we have NAND, pretty much anything can be lowered to
that... I am, of course, not advocating keeping around every insane
instruction, but it does seem a bit arbitrary as to which ones we have
and which ones we don't... I am personally guilty of adding a bunch,
and it was never clear to m
This looks good to me. Other candidates for removal:
SUB (same as ADD with the Negate bit inverted)
CLAMP (same as MIN+MAX), some drivers don't implement this
ABS (same as MOV with the Abs bit set)
Marek
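
The equivalences listed above can be sketched in scalar C++ as follows. This is only an illustration: the function names are hypothetical stand-ins, not Mesa or TGSI APIs, and the Negate and Abs source modifiers are modeled with plain negation and std::fabs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical scalar stand-ins for the opcodes that would remain.
static float op_add(float a, float b) { return a + b; }
static float op_mov(float a) { return a; }
static float op_min(float a, float b) { return std::min(a, b); }
static float op_max(float a, float b) { return std::max(a, b); }

// SUB a, b  ->  ADD a, -b  (Negate modifier inverted on the second source)
static float lowered_sub(float a, float b) { return op_add(a, -b); }

// CLAMP x, lo, hi  ->  MIN(MAX(x, lo), hi)
static float lowered_clamp(float x, float lo, float hi)
{
    return op_min(op_max(x, lo), hi);
}

// ABS x  ->  MOV |x|  (Abs modifier set on the source)
static float lowered_abs(float x) { return op_mov(std::fabs(x)); }
```

Each lowered form needs only opcodes and source modifiers that a driver already has to handle, which is the argument for treating the originals as removal candidates.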
On Thu, Nov 13, 2014 at 2:18 AM, Eric Anholt wrote:
> This series removes a bunch of unus
Nine can lower ARR into ROUND+ARL easily.
Marek
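
A rough scalar sketch of that lowering, assuming ARL is a floor-to-integer address load and ARR is "(int)roundf()" as described elsewhere in this thread. The helper names are hypothetical, and the exact tie-breaking rule of ROUND is an assumption here (std::round is used for both paths):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar models of the opcodes involved.
static float op_round(float x) { return std::round(x); }                 // ROUND
static int op_arl(float x) { return static_cast<int>(std::floor(x)); }   // ARL
static int op_arr(float x) { return static_cast<int>(std::round(x)); }   // ARR

// ARR x  ->  ARL(ROUND(x)): once x is rounded to an integral value,
// the floor inside ARL leaves it unchanged.
static int lowered_arr(float x) { return op_arl(op_round(x)); }
```

Because ROUND already yields an integral float, the ARL step only performs the conversion, so the two-instruction sequence matches the single ARR in this model.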
On Thu, Nov 13, 2014 at 3:33 PM, Jose Fonseca wrote:
> It looks like ARR is generated, as
> src/gallium/state_trackers/nine/nine_shader.c has
>
> #define _OPI(o,t,vv1,vv2,pv1,pv2,d,s,h) \
> { D3DSIO_##o, TGSI_OPCODE_##t, { vv1, vv2 }, { pv1, p
On 11/12/2014 06:18 PM, Eric Anholt wrote:
This series removes a bunch of unused opcodes, mostly from TGSI. It
doesn't go as far as we could possibly go -- while I welcome discussion
for future patch series deleting more, I hope that discussion doesn't
derail the review process for these changes
On 11/13/2014 08:32 AM, Neil Roberts wrote:
---
src/glsl/linker.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index bd2aa3c..41d6a82 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -2411,7 +2411,7 @@ reserve_expl
---
src/glsl/linker.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index bd2aa3c..41d6a82 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -2411,7 +2411,7 @@ reserve_explicit_locations(struct gl_shader_program *prog,
On Thu, Nov 13, 2014 at 11:10:39AM +, Jose Fonseca wrote:
> Hi Tom,
>
> That's peculiar. It looks like pthreads got into a weird state somehow.
> Don't precisely understand how though. Maybe there's a race inside
> pipe_semaphore_signal() with the destruction of the semaphore.
>
> I think
It looks like ARR is generated, as
src/gallium/state_trackers/nine/nine_shader.c has
#define _OPI(o,t,vv1,vv2,pv1,pv2,d,s,h) \
{ D3DSIO_##o, TGSI_OPCODE_##t, { vv1, vv2 }, { pv1, pv2, }, d, s, h }
[...]
_OPI(MOVA, ARR, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
Jose
Reviewed-by: Marek Olšák
Marek
On Wed, Nov 12, 2014 at 1:37 PM, wrote:
> From: José Fonseca
>
> The latest version of the specs explicitly allow it, and given that Mesa
> universally supports KHR_debug we should definitely support it.
>
> Totally untested. (Just happened to notice this whil
Kenneth Graunke writes:
> We did this for fs_reg a while back, and it's generally a good idea.
>
I disagree, explicit constructors aren't a one-size-fits-all. IMO there
are three scenarios in which explicit constructors may be a good idea:
- Cases where your constructor may lose relevant infor
Kenneth Graunke writes:
> On Wednesday, November 12, 2014 09:57:30 PM Matt Turner wrote:
>> On Wed, Nov 12, 2014 at 9:35 PM, Kenneth Graunke
> wrote:
>> > +vec4_visitor::emit_math(enum opcode opcode,
>> > + dst_reg dst, src_reg src0, src_reg src1)
>>
>> I think you can ma
Kenneth Graunke writes:
> We do this almost everywhere else; this should make it easier to modify.
>
> Signed-off-by: Kenneth Graunke
For this patch:
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_vec4.h | 98
> +++-
> 1 file changed, 41 i
Kenneth Graunke writes:
> Signed-off-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_vec4.h | 18 --
> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 ++-
> 2 files changed, 14 insertions(+), 15 deletions(-)
>
> di
Thanks for doing this. It has been long overdue.
Unfortunately we are relying on TGSI_OPCODE_CND/TGSI_OPCODE_ARR internally.
I'm also interested in cutting down used opcodes, so I'll try to replace their
usage with something else. But until then please hold on to those two patches.
The res
Reviewed-by: Marek Olšák
I suggest pasting the commit message into the code.
Marek
On Thu, Nov 13, 2014 at 7:52 AM, Michel Dänzer wrote:
> From: Michel Dänzer
>
> Using the asynchronous DMA engine for multi-dimensional operations seems
> to cause random GPU lockups for various people. While t
Hi Tom,
That's peculiar. It looks like pthreads got into a weird state somehow. Don't
precisely understand how though. Maybe there's a race inside
pipe_semaphore_signal() with the destruction of the semaphore.
I think the best thing for now is to revert to old behavior for non-windows
platfo
On 10/30/2014 12:29 AM, Ilia Mirkin wrote:
On Mon, Oct 27, 2014 at 6:34 AM, Alexandre Courbot wrote:
GK20A does not have dedicated VRAM, therefore allocating in VRAM can be
sub-optimal and sometimes even harmful. Set its VRAM domain to
NOUVEAU_BO_GART so all objects are allocated in system memo
https://bugs.freedesktop.org/show_bug.cgi?id=86195
--- Comment #5 from Iaroslav Andrusyak ---
Created attachment 109395
--> https://bugs.freedesktop.org/attachment.cgi?id=109395&action=edit
error
--
You are receiving this mail because:
You are the assignee for the bug.
___
https://bugs.freedesktop.org/show_bug.cgi?id=86195
--- Comment #4 from Iaroslav Andrusyak ---
DRAW_USE_LLVM=0 does not help, and there is no console output from LW;
Lightworks is totally silent. I only have a few logs in the Lightworks folder:
StdErr.log and error.log
On Wednesday, November 12, 2014 09:57:30 PM Matt Turner wrote:
> On Wed, Nov 12, 2014 at 9:35 PM, Kenneth Graunke
wrote:
> > +vec4_visitor::emit_math(enum opcode opcode,
> > + dst_reg dst, src_reg src0, src_reg src1)
>
> I think you can make the arguments const references t
https://bugs.freedesktop.org/show_bug.cgi?id=86195
--- Comment #3 from Michel Dänzer ---
(In reply to Iaroslav Andrusyak from comment #2)
> stderr
Did that crash as well? There's only one LLVM dump in there, and no immediate
sign of a crash. If it did crash, can you try again with R600_DEBUG=vs,
https://bugs.freedesktop.org/show_bug.cgi?id=86195
--- Comment #2 from Iaroslav Andrusyak ---
Created attachment 109393
--> https://bugs.freedesktop.org/attachment.cgi?id=109393&action=edit
stderr