The following 2 patches make it possible to run Mesa programs on GK20A
(Tegra K1).
GK20A is very similar to GK104, but uses a new (backward-compatible) 3D class
as well as the same ISA as GK110 (SM35). Taking these differences into account
is sufficient to successfully render simple off-screen buf
GK20A is mostly compatible with GK104, but uses the SM35 ISA. Use
the GK110 path when this chip is detected.
Signed-off-by: Alexandre Courbot
---
src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h | 2 +-
src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 2 +-
.../drivers/nouv
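The chip-to-ISA choice described in this patch can be pictured with a small sketch. The enum names and the isa_for() helper are hypothetical and are not taken from the nouveau sources. The SM30 entry for GK104 is an addition from general knowledge and is not stated in the message.

```cpp
#include <cassert>

// Hypothetical chip and ISA identifiers, for illustration only.
enum class Chip { GK104, GK110, GK20A };
enum class Isa { SM30, SM35 };

// Pick the code generation path for a chip. GK20A shares the SM35 ISA
// with GK110, so both take the same path.
inline Isa isa_for(Chip chip)
{
    switch (chip) {
    case Chip::GK104:
        return Isa::SM30;  // assumption: not stated in the message
    case Chip::GK110:
    case Chip::GK20A:
        return Isa::SM35;
    }
    assert(false && "unknown chip");
    return Isa::SM30;
}
```

Grouping GK20A with GK110 in a single case keeps the special handling in one place.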
GK20A is mostly compatible with GK104, but features a new 3D
class. Add it to the relevant header and use it when GK20A is
detected.
Signed-off-by: Alexandre Courbot
---
src/gallium/drivers/nouveau/nv_object.xml.h| 1 +
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 9 -
2 files ch
https://bugs.freedesktop.org/show_bug.cgi?id=79230
Andreas Boll changed:
What|Removed |Added
CC||i...@freedesktop.org
Keywords|
https://bugs.freedesktop.org/show_bug.cgi?id=79039
Andreas Boll changed:
What|Removed |Added
Depends on||79230
--
You are receiving this mail bec
https://bugs.freedesktop.org/show_bug.cgi?id=78914
--- Comment #16 from Florian Link ---
This is strange, since in my renderer I render all back faces to one FBO and
all front faces to another FBO, so the front/back faces do not fight in the
Z-buffer.
I really experience missing pixels on faces
https://bugs.freedesktop.org/show_bug.cgi?id=78914
--- Comment #17 from Florian Link ---
Ok, you are right, it only happens with depth test enabled.
The strange thing is that it creates these artifacts in my ray caster,
where I get exactly these holes but both front/back faces have the same
int
https://bugs.freedesktop.org/show_bug.cgi?id=79294
Priority: medium
Bug ID: 79294
Assignee: mesa-dev@lists.freedesktop.org
Summary: Xlib-based build broken on non x86/x86-64
architectures
Severity: blocker
Classification:
https://bugs.freedesktop.org/show_bug.cgi?id=79039
Andreas Boll changed:
What|Removed |Added
Depends on||79294
--
You are receiving this mail bec
https://bugs.freedesktop.org/show_bug.cgi?id=79294
Andreas Boll changed:
What|Removed |Added
CC||i...@freedesktop.org
Keywords|
https://bugs.freedesktop.org/show_bug.cgi?id=78914
--- Comment #18 from Florian Link ---
Ok, I can confirm that it is a depth fighting problem and found a fix for my
ray caster. Thank you for your effort!
Still it would be good if LLVM pipe would do the same quality depth test as
softpipe and NV
https://bugs.freedesktop.org/show_bug.cgi?id=79230
Emilio Pozuelo Monfort changed:
What|Removed |Added
CC||poch...@gmail.com
--
You are r
mesaVisual can be NULL with configless context since this commit:
commit 551d459af421a2eb937e9e16301bb64da4624f89
Author: Neil Roberts
Date: Fri Mar 7 18:05:47 2014 +
Add the EGL_MESA_configless_context extension
...
Previously the i965 and i915 drivers were explicitly
On 27/05/14 05:46, Kertesz Laszlo wrote:
> On 05/19/2014 06:06 PM, Kai Wasserbäch wrote:
>> Michel Dänzer wrote on 19.05.2014 04:12:
>>> On 18.05.2014 18:37, Kai Wasserbäch wrote:
And instead of just not starting, my X starts crashing, whenever
libGL fails to load a (32 bit) driver
https://bugs.freedesktop.org/show_bug.cgi?id=78679
The c->runtime_check_aads_emit field has been unused since the removal of the
old ARB_fragment_shader backend in commit
098acf6c84333edbb7b1228545e4bdb2572ee0cd.
This field was relevant in Gen < 6 to do proper rendering of polygons in a
scenario
In Gen < 6 the hardware generates a runtime bit that indicates whether AA data
has to be sent as part of the framebuffer write SEND message. This affects the
specific case where we have setup antialiased line rendering and we render
polygons which have one face setup in GL_LINE mode (line antialias
When an instruction stream ends in a block structure (like an IF/ELSE/ENDIF) the
last block's end pointer will not be set, leading to a crash later on in
fs_live_variables::setup_def_use().
If we have not assigned the end pointer of the last block, set it to the last
instruction.
---
src/mesa/drive
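A minimal sketch of the fix described in the message above, assuming blocks are tracked by instruction index. The Block type and close_last_block() are hypothetical stand-ins, not the i965 structures.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for a basic block.
struct Block {
    int start = -1;  // index of the first instruction in the block
    int end = -1;    // index of the last instruction, -1 while unset
};

// If the stream ended before anything assigned the last block's end,
// point it at the last instruction so later passes see a valid range.
inline void close_last_block(std::vector<Block> &blocks, int instruction_count)
{
    if (blocks.empty() || instruction_count <= 0)
        return;
    Block &last = blocks.back();
    assert(last.start < instruction_count);
    if (last.end < 0)
        last.end = instruction_count - 1;
}
```

Blocks whose end is already set are left untouched, so the helper would be safe to call unconditionally after building the block list.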
In Gen < 6 AA data will or will not be sent as part of the framebuffer write
SEND message depending on a runtime condition, so don't bother moving AA
data to the corresponding MRF register until we know that we need to send it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78679
---
src/
https://bugs.freedesktop.org/show_bug.cgi?id=78914
--- Comment #19 from Roland Scheidegger ---
depth test as such is as accurate as it could be. Doing interpolation with as
much precision as possible is not all that easy due to properties of floating
point arithmetic. In particular for the math t
Fixes framebuffer_blit_functionality_multisampled_to_singlesampled_blit
es3 cts test on bdw. Also fixes this on ivb when ivb is forced to use
the meta path.
No piglit regressions on IVB.
Signed-off-by: Topi Pohjolainen
Cc: Eric Anholt
Cc: Matt Turner
Cc: Kenneth Graunke
Cc: Anuj Phogat
Cc: "
Signed-off-by: Leo Liu
---
src/gallium/include/pipe/p_video_state.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/gallium/include/pipe/p_video_state.h
b/src/gallium/include/pipe/p_video_state.h
index 0256a8f..6621dbd 100644
--- a/src/gallium/include/pipe/p_video_state.h
+++ b/src/gal
Signed-off-by: Leo Liu
---
src/gallium/state_trackers/omx/vid_enc.c | 11 +--
src/gallium/state_trackers/omx/vid_enc.h | 1 +
2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/src/gallium/state_trackers/omx/vid_enc.c
b/src/gallium/state_trackers/omx/vid_enc.c
index ee31452
Signed-off-by: Leo Liu
---
src/gallium/drivers/radeon/radeon_vce.c| 6 --
src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 2 +-
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/src/gallium/drivers/radeon/radeon_vce.c
b/src/gallium/drivers/radeon/radeon_vce.c
index 222f
This way, when someone modifies create_test_cases.py and forgets to
commit their changes again, people will notice.
Signed-off-by: Connor Abbott
---
src/glsl/tests/optimization-test | 7 +++
1 file changed, 7 insertions(+)
diff --git a/src/glsl/tests/optimization-test b/src/glsl/tests/optim
Make sure that we print the same number of digits when printing 0.0 as
any other floating-point number. This will make generating expected
output files for tests easier. To avoid breaking "make check," update
the generated tests for lower_jumps before the next commit which will
bring create_test_ca
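As an illustration of printing every float with the same number of digits, here is a small sketch. The format_float() helper is hypothetical and is not the printer used in the GLSL compiler. It only shows that one "%f" conversion treats 0.0 like any other value.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Format a float with the default "%f" precision (six digits after the
// decimal point), with no special case for zero.
inline std::string format_float(float value)
{
    char buf[64];  // large enough for any finite float printed with "%f"
    int n = std::snprintf(buf, sizeof buf, "%f", value);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    return std::string(buf, static_cast<std::size_t>(n));
}
```

With a single code path, format_float(0.0f) gives "0.000000" and format_float(1.5f) gives "1.500000", so expected-output files can be generated mechanically.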
In 088494aa (as well as other commits in the series) Paul Berry modified
the tests for lower_jumps to account for the fact that the s-expression
for the loop IR instruction changed from
(loop () () () () (statements...)) to (loop (statements...)), but he
forgot to update create_test_cases.py which
They were made unnecessary by the last commit.
Signed-off-by: Connor Abbott
---
src/glsl/tests/lower_jumps/.gitignore | 2 ++
src/glsl/tests/lower_jumps/lower_breaks_1.opt_test | 12 -
.../lower_jumps/lower_breaks_1.opt_test.expected | 4 ---
src/glsl/tests/lower_jumps/l
While trying to modify the lower_jumps unit tests to account for my SSA
changes, I realized that the tests were not in sync with the file that
generated them. There were two problems:
-The *.expected files all had the same number of digits after the
decimal place (6) whereas the *.out files had 1
On 27.05.2014 16:12, Leo Liu wrote:
Signed-off-by: Leo Liu
Reviewed and pushed upstream.
Thanks,
Christian.
---
src/gallium/include/pipe/p_video_state.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/gallium/include/pipe/p_video_state.h
b/src/gallium/include/pipe/p_video_st
On Tue, May 27, 2014 at 2:35 AM, Alexandre Courbot wrote:
> On 05/27/2014 02:29 PM, Ilia Mirkin wrote:
>>
>> On Tue, May 27, 2014 at 12:59 AM, Alexandre Courbot
>> wrote:
>>>
>>> GK20A is mostly compatible with GK104, but features a new 3D
>>> class. Add it to the relevant header and use it when
On Tue, May 27, 2014 at 3:03 AM, Alexandre Courbot wrote:
> GK20A is mostly compatible with GK104, but uses the SM35 ISA. Use
> the GK110 path when this chip is detected.
>
> Signed-off-by: Alexandre Courbot
Reviewed-by: Ilia Mirkin
> ---
> src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
On Tue, May 27, 2014 at 6:21 AM, Topi Pohjolainen
wrote:
> Fixes framebuffer_blit_functionality_multisampled_to_singlesampled_blit
> es3 cts test on bdw. Also fixes this on ivb when ivb is forced to use
> the meta path.
>
> No piglit regressions on IVB.
>
> Signed-off-by: Topi Pohjolainen
> Cc: E
From: Christoph Bumiller
Marek v2: add a cap
Signed-off-by: Marek Olšák
---
src/gallium/auxiliary/tgsi/tgsi_strings.c| 1 +
src/gallium/auxiliary/tgsi/tgsi_ureg.c | 16
src/gallium/auxiliary/tgsi/tgsi_ureg.h | 4
src/gallium/docs/source/scree
On 05/27/2014 03:31 AM, Lubomir Rintel wrote:
> mesaVisual can be NULL with configless context since this commit:
>
> commit 551d459af421a2eb937e9e16301bb64da4624f89
> Author: Neil Roberts
> Date: Fri Mar 7 18:05:47 2014 +
>
> Add the EGL_MESA_configless_context extension
>
On 05/27/2014 06:21 AM, Topi Pohjolainen wrote:
> Fixes framebuffer_blit_functionality_multisampled_to_singlesampled_blit
> es3 cts test on bdw. Also fixes this on ivb when ivb is forced to use
> the meta path.
>
> No piglit regressions on IVB.
>
> Signed-off-by: Topi Pohjolainen
> Cc: Eric Anho
regs_written is in units of virtual GRFs.
---
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 171f063..b51ecc1 100644
--- a/src/
Reviewed-by: Chris Forbes
On Wed, May 28, 2014 at 10:27 AM, Matt Turner wrote:
> regs_written is in units of virtual GRFs.
> ---
> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor
Signed-off-by: Axel Davy
---
src/gallium/state_trackers/dri/drm/dri2.c | 43 ---
1 file changed, 40 insertions(+), 3 deletions(-)
diff --git a/src/gallium/state_trackers/dri/drm/dri2.c
b/src/gallium/state_trackers/dri/drm/dri2.c
index b5bc16b..f01257a 100644
--- a/sr
Signed-off-by: Axel Davy
---
src/Makefile.am | 4 +++-
src/loader/Makefile.am | 21 ---
src/loader/loader.c | 27 +
src/mesa/drivers/dri/common/xmlconfig.h | 2 ++
From: Keith Packard
Upper levels of the stack use base.stamp to tell when a drawable needs to be
revalidated, but the dri state tracker was using dPriv->lastStamp. Those two,
along with dri2.stamp, all get simultaneously incremented when a dri2
invalidate event was delivered, and so end up contai
Signed-off-by: Axel Davy
---
src/egl/drivers/dri2/egl_dri2.h | 5 +-
src/egl/drivers/dri2/platform_wayland.c | 171 ++--
2 files changed, 142 insertions(+), 34 deletions(-)
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index
From: Keith Packard
Provide the hook to pull textures out of __DRIimage structures and use them as
renderbuffers.
Signed-off-by: Keith Packard
---
src/gallium/state_trackers/dri/drm/dri2.c | 238 +-
1 file changed, 230 insertions(+), 8 deletions(-)
diff --git a/src
From: Ben Skeggs
Signed-off-by: Ben Skeggs
Signed-off-by: Keith Packard
---
src/gallium/state_trackers/dri/drm/dri2.c | 22 +++---
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/src/gallium/state_trackers/dri/drm/dri2.c
b/src/gallium/state_trackers/dri/drm/dri
Signed-off-by: Axel Davy
---
src/glx/dri3_glx.c | 235 +++-
src/glx/dri3_priv.h | 2 +
2 files changed, 200 insertions(+), 37 deletions(-)
diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 3d8a662..54030bb 100644
--- a/src/glx/dri3_glx.
Signed-off-by: Axel Davy
---
src/loader/loader.c | 188
src/loader/loader.h | 7 ++
2 files changed, 195 insertions(+)
diff --git a/src/loader/loader.c b/src/loader/loader.c
index 666d015..3d504f7 100644
--- a/src/loader/loader.c
+++ b/src/l
This significantly improves GLX DRI3 GPU offloading, particularly on
CPU-bound benchmarks.
No performance impact for DRI2 GPU offloading.
Signed-off-by: Axel Davy
---
src/gallium/drivers/radeonsi/si_blit.c | 15 +++
1 file changed, 15 insertions(+)
diff --git a/src/gallium/drivers/ra
Currently GPU offloading is supported only with GLX DRI2.
You need to set it up with xrandr, and you need a DDX loaded for
the secondary device, even if it has no screen.
The DRI_PRIME environment variable selects which GPU the application
should use. Unfortunately it has some issues: Rendering to a p
It allows blitting between two __DRIimages.
Signed-off-by: Axel Davy
---
include/GL/internal/dri_interface.h | 11 ++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/include/GL/internal/dri_interface.h
b/include/GL/internal/dri_interface.h
index 4d57d0b..2ee3164 100644
--- a/inclu
Signed-off-by: Axel Davy
---
src/mesa/drivers/dri/common/xmlconfig.c | 29 +
src/mesa/drivers/dri/common/xmlconfig.h | 7 ++-
2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/common/xmlconfig.c
b/src/mesa/drivers/dri/common/xmlc
Previously, we set up new entries in the params[] array on every access
of a rectangle texture. Unfortunately, we only reserve space for
(2 * MaxTextureImageUnits) extra entries, so programs which accessed
rectangle textures more times than that would write off the end of the
array and likely cras
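One way to keep the number of entries bounded is to hand out a slot once per texture unit and reuse it on later accesses. The sketch below assumes that approach. The message is cut off before it describes the actual fix, and ParamTable, kMaxUnits and slot() are hypothetical names.

```cpp
#include <array>
#include <cassert>

constexpr int kMaxUnits = 4;  // hypothetical stand-in for MaxTextureImageUnits

// Hands out at most one extra entry per texture unit, no matter how many
// times the unit's rectangle texture is accessed.
struct ParamTable {
    int used = 0;                               // entries handed out so far
    std::array<int, kMaxUnits> slot_for_unit;   // -1 until a unit gets a slot

    ParamTable() { slot_for_unit.fill(-1); }

    int slot(int unit)
    {
        assert(unit >= 0 && unit < kMaxUnits);
        if (slot_for_unit[unit] < 0)
            slot_for_unit[unit] = used++;  // first access: allocate
        return slot_for_unit[unit];        // later accesses: reuse
    }
};
```

Because the slot is remembered per unit, the count of entries can never exceed the number of units, whatever the number of accesses in the program.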
In 088494aa (as well as other commits in the series) Paul Berry modified
the tests for lower_jumps to account for the fact that the s-expression
for the loop IR instruction changed from
(loop () () () () (statements...)) to (loop (statements...)), but he
forgot to update create_test_cases.py which
While trying to modify the lower_jumps unit tests to account for my SSA
changes, I realized that the tests were not in sync with the file that
generated them. There were two problems:
-The *.expected files all had the same number of digits after the
decimal place (6) whereas the *.out files had 1
They were made unnecessary by the last commit.
Signed-off-by: Connor Abbott
---
src/glsl/tests/lower_jumps/.gitignore | 2 ++
src/glsl/tests/lower_jumps/lower_breaks_1.opt_test | 12 -
.../lower_jumps/lower_breaks_1.opt_test.expected | 4 ---
src/glsl/tests/lower_jumps/l
This way, when someone modifies create_test_cases.py and forgets to
commit their changes again, people will notice.
v2: make sure we parse the right directories and check for existence the
right way.
Signed-off-by: Connor Abbott
---
src/glsl/tests/optimization-test | 8
1 file changed,
Make sure that we print the same number of digits when printing 0.0 as
any other floating-point number. This will make generating expected
output files for tests easier. To avoid breaking "make check," update
the generated tests for lower_jumps before the next commit which will
bring create_test_ca
Will get more complicated when fs_reg src becomes a pointer.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 5 +
src/mesa/drivers/dri/i965/brw_fs.h | 1 +
2 files changed, 6 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index bd77e0c..5b
Here's a respin of my load_payload series from mid-April with some
feedback from Ken addressed and some bugs fixed.
This series is available in my tree (with a few unrelated patches
before it)
git://people.freedesktop.org/~mattst88/mesa tex-sources
This is a prep series for implementing SSA i
The fs_reg src array is going to turn into a pointer and we'd rather not
consider the implications of shallow copying fs_insts.
Reviewed-by: Topi Pohjolainen
---
src/mesa/drivers/dri/i965/brw_fs.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h
b/src/mes
But only into non-load_payload instructions. Otherwise we would prevent
register coalescing from combining identical payloads.
---
.../drivers/dri/i965/brw_fs_copy_propagation.cpp | 22 ++
1 file changed, 22 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_pr
---
src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 33
1 file changed, 21 insertions(+), 12 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index e40567f..5037579 100644
--- a/src/mesa/drivers/dri/i965/brw
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 24 +++---
.../drivers/dri/i965/brw_fs_copy_propagation.cpp | 4 ++--
src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 2 +-
.../dri/i965/brw_fs_dead_code_eliminate.cpp| 2 +-
src/mesa/drivers/dri/i965/brw_fs
---
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 135 +++
1 file changed, 73 insertions(+), 62 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index b51ecc1..10ec254 100644
--- a/src/mesa/drivers/dri/
In a fashion suggested by Ken.
---
Allocating fewer sources than 3 is not handled in this series.
src/mesa/drivers/dri/i965/brw_fs.cpp | 90 ++--
src/mesa/drivers/dri/i965/brw_fs.h | 17 ---
2 files changed, 32 insertions(+), 75 deletions(-)
diff --git a/src
Also add an emit() function that calls it.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 12
src/mesa/drivers/dri/i965/brw_fs.h | 3 +++
2 files changed, 15 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 1f174d3..c86cb42
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 21 +++--
src/mesa/drivers/dri/i965/brw_fs.h | 3 ++-
2 files changed, 13 insertions(+), 11 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index b06966a..a9a8ac1 100644
--- a/src/
Clean up with register_coalesce()/dead_code_eliminate().
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 42
src/mesa/drivers/dri/i965/brw_fs.h | 1 +
2 files changed, 43 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dr
instructions in affected programs: 474 -> 462 (-2.53%)
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index c0af6d0..453683c 100644
--- a/src/mesa/dr
Will be used to simplify the handling of large virtual GRFs in SSA form.
---
src/mesa/drivers/dri/i965/brw_defines.h| 2 ++
src/mesa/drivers/dri/i965/brw_fs.cpp | 10 ++
src/mesa/drivers/dri/i965/brw_fs.h | 2 ++
src/mesa/drivers/dri/i965/brw_fs_generator.cp
Since CSE creates instructions, if we let CSE generate things register
coalescing can't remove, bad things will happen. Only let CSE combine
non-copy load_payloads.
E.g., allow CSE to handle this
load_payload vgrf4+0, vgrf5, vgrf6
but not this
load_payload vgrf4+0, vgrf5+0, vgrf5+1
---
s
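The distinction drawn in the two load_payload examples can be sketched as a predicate. This assumes that a "copy" payload is one whose sources all read the same VGRF at consecutive offsets, which is an interpretation of the examples and not a definition given in the message. Src, is_copy_payload() and cse_may_combine() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical source operand: a virtual GRF number plus a register offset.
struct Src {
    int vgrf;
    int offset;
};

// Assumed definition: a payload is a plain copy when every source reads
// the same VGRF at consecutive offsets (vgrf5+0, vgrf5+1, ...).
inline bool is_copy_payload(const std::vector<Src> &srcs)
{
    if (srcs.empty())
        return false;
    for (std::size_t i = 0; i < srcs.size(); ++i) {
        if (srcs[i].vgrf != srcs[0].vgrf)
            return false;
        if (srcs[i].offset != srcs[0].offset + static_cast<int>(i))
            return false;
    }
    return true;
}

// Only non-copy payloads are candidates for CSE in this sketch.
inline bool cse_may_combine(const std::vector<Src> &srcs)
{
    return !is_copy_payload(srcs);
}
```

Under this reading, the first example (sources vgrf5 and vgrf6) is eligible for CSE and the second (vgrf5+0 and vgrf5+1) is not.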
Helps Unigine Tropics and some (old) gstreamer shaders in shader-db.
instructions in affected programs: 792 -> 744 (-6.06%)
---
src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 12 +++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
b
So that we don't have partial writes to a large VGRF. Will be cleaned up
by register coalescing.
---
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 15 ++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
b/src/mesa/drivers/dri
---
.../drivers/dri/i965/brw_fs_register_coalesce.cpp | 59 ++
1 file changed, 49 insertions(+), 10 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
index a0aa169..0aa4b3e 100644
--- a/s
---
Allocating fewer sources than 3 is not handled in this series.
src/mesa/drivers/dri/i965/brw_fs.cpp | 8
src/mesa/drivers/dri/i965/brw_fs.h | 2 +-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 9 +
src/mesa/drivers/dri/i965/brw_fs.h | 2 ++
2 files changed, 11 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index f926d97..1f174d3 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
---
src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 19 +++
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index 94f657d..e40567f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
++
From: Ian Romanick
Most of the overhead of the name allocation is the ralloc tracking,
especially on 64-bit. The allocation of the variable name "i" is 2
bytes for the name and 40 bytes for the ralloc tracking.
Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 225KiB o
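The figures quoted for the name "i" can be reproduced with a short calculation. The 40-byte header size is the number given in the message for 64-bit. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Tracking overhead per allocation on 64-bit, as quoted in the message.
constexpr std::size_t kHeaderBytes = 40;

// Bytes needed for the string itself, including the terminating NUL.
inline std::size_t name_bytes(const char *name)
{
    assert(name != nullptr);
    return std::strlen(name) + 1;
}

// Total cost of a separately tracked name allocation.
inline std::size_t tracked_bytes(const char *name)
{
    return name_bytes(name) + kHeaderBytes;
}
```

For "i" this gives 2 bytes of payload out of 42 bytes in total, which is why short names are dominated by the tracking header.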
From: Ian Romanick
No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 64-bit.
No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 32-bit.
Signed-off-by: Ian Romanick
---
src/glsl/ir.h | 5 +++--
1 file changed, 3 insertions(+), 2 dele
From: Ian Romanick
At least one of these pointers must be NULL, and we can determine which
will be NULL by looking at other fields. Use this information to store
both pointers in the same location.
If anyone can think of a better name for the union than "u", I'm all
ears.
Reduces the peak ir_v
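The idea of sharing one storage location between two pointers, of which at least one is always NULL, can be sketched as follows. The field and type names are hypothetical, since the message does not name the two pointers. Only the union name "u" comes from the text.

```cpp
#include <cassert>

struct StateSlot {};      // hypothetical payload types
struct ConstantValue {};

struct Var {
    bool has_state;  // hypothetical field that tells which member is live
    union {
        StateSlot *state;
        ConstantValue *constant;
    } u;
};

inline Var make_state_var(StateSlot *s)
{
    assert(s != nullptr);
    Var v;
    v.has_state = true;
    v.u.state = s;
    return v;
}

inline Var make_constant_var(ConstantValue *c)
{
    Var v;
    v.has_state = false;
    v.u.constant = c;
    return v;
}

// Accessors only read the member that the discriminating field marks live.
inline StateSlot *get_state(const Var &v)
{
    return v.has_state ? v.u.state : nullptr;
}

inline ConstantValue *get_constant(const Var &v)
{
    return v.has_state ? nullptr : v.u.constant;
}
```

The union occupies the space of a single pointer, and the accessors return NULL for whichever pointer is not in use.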
From: Ian Romanick
Also move the new warn_extension_index into ir_variable::data. This
enables slightly better packing.
Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 204KiB on 64-bit.
Before: IR MEM: variable usage / name / total: 5955672 1439077 7394749
After: I
From: Ian Romanick
warn_extension_index was moved to improve packing.
No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 64-bit.
Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 102KiB on 32-bit.
Before: IR MEM: variable usage / name / t
From: Ian Romanick
All of the GL image enums fit in 16 bits.
Also move the fields from the anonymous "image" structure to the next
higher structure. This will enable packing the bits with the other
bitfield.
Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 204KiB
From: Ian Romanick
The payoff for this will come in a few more patches.
Signed-off-by: Ian Romanick
---
src/glsl/ast_array_index.cpp | 10 --
src/glsl/ir.h| 37 -
src/glsl/ir_validate.cpp | 9 +++--
src/glsl/link_functions.cp
From: Ian Romanick
This has the added perk that if you forget to set ir_type in the
constructor of a new subclass (or a new constructor of an existing
subclass) the compiler will tell you... instead of relying on
ir_validate or similar run-time detection.
Signed-off-by: Ian Romanick
---
src/gl
From: Ian Romanick
Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 66KiB on 64-bit.
Before: IR MEM: variable usage / name / total: 5746368 1208630 6954998
After: IR MEM: variable usage / name / total: 5746368 1140817 6887185
Reduces the peak ir_variable memory usage
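The KiB figure in the message above can be checked against its Before/After totals. This sketch assumes the totals are byte counts and that the quoted saving is rounded down to whole KiB. Both are inferences from the numbers, not statements in the message, and saved_kib() is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Whole KiB saved between two byte totals (integer division rounds down).
inline std::int64_t saved_kib(std::int64_t before_bytes, std::int64_t after_bytes)
{
    assert(before_bytes >= after_bytes);
    return (before_bytes - after_bytes) / 1024;
}
```

With the totals quoted above, 6954998 - 6887185 = 67813 bytes, and 67813 / 1024 rounds down to 66, matching the 66KiB stated for 64-bit.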
From: Ian Romanick
Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 181KiB on 64-bit.
Before: IR MEM: variable usage / name / total: 5327760 1121441 6449201
After: IR MEM: variable usage / name / total: 5327760 935234 6262994
Reduces the peak ir_variable memory usage
From: Ian Romanick
The only values allowed are 0 and 1, and the value is checked before
assigning.
With the previous changes, reduces the peak ir_variable memory usage in
a trimmed apitrace of dota2 by 204KiB on 64-bit.
Before: IR MEM: variable usage / name / total: 6374280 1439077 7813357
Afte
From: Ian Romanick
Currently this is done at each call to glLinkProgram. This seems like
as good a place as any. This is the main place where memory usage will
change, and it enables tracking as applications progress (e.g., load new
levels).
Signed-off-by: Ian Romanick
---
src/mesa/main/shad
From: Ian Romanick
Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 18KiB on 64-bit.
Before: IR MEM: variable usage / name / total: 5746368 1140817 6887185
After: IR MEM: variable usage / name / total: 5746368 1121441 6867809
Reduces the peak ir_variable memory usage
From: Ian Romanick
The payoff for this will come in the next patch.
No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 64-bit.
No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 32-bit.
Signed-off-by: Ian Romanick
---
src/glsl/built
From: Ian Romanick
v2: Also account for the ralloc header overhead.
Signed-off-by: Ian Romanick
---
src/glsl/Makefile.sources| 1 +
src/glsl/ir_memory_usage.cpp | 104 +++
src/glsl/ir_memory_usage.h | 48
3 files changed, 15
From: Ian Romanick
In the next patch, the type of ir_type is going to change from enum to
uint8_t. Since the type won't be an enum, we won't get compiler
warnings about, for example, switch statements that don't have cases for
all the enum values. Using a getter that returns the enum type will
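A minimal sketch of the getter approach, assuming the type tag is stored as uint8_t and handed back as the enum. NodeType, Node and describe() are hypothetical names and not the ir_instruction classes.

```cpp
#include <cassert>
#include <cstdint>

enum NodeType { kAssign, kCall, kLoop };

class Node {
public:
    explicit Node(NodeType t) : type_(static_cast<std::uint8_t>(t)) {}

    // Return the enum type, not the raw byte, so that a switch on the
    // result can still be checked for missing cases by the compiler.
    NodeType node_type() const { return static_cast<NodeType>(type_); }

private:
    std::uint8_t type_;  // one byte of storage for the tag
};

inline const char *describe(const Node &n)
{
    switch (n.node_type()) {
    case kAssign: return "assign";
    case kCall:   return "call";
    case kLoop:   return "loop";
    }
    assert(false && "invalid node type");
    return "invalid";
}
```

Switching on node_type() keeps compiler diagnostics for unhandled enumerators available (for example with -Wswitch in GCC and Clang), while the object itself stores only a single byte.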
From: Ian Romanick
No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 64-bit.
No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 32-bit.
Signed-off-by: Ian Romanick
---
src/glsl/ir.h | 14 ++
1 file changed, 14 insertions
From: Ian Romanick
Just use ir_variable::data.binding... because that's where the
binding is stored for everything else that can use layout(binding=).
No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 64-bit.
Reduces the peak ir_variable memory usage in a trim
From: Ian Romanick
Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 39KiB on 64-bit.
Before: IR MEM: variable usage / name / total: 5327760 935234 6262994
After: IR MEM: variable usage / name / total: 5327760 894914 6222674
Reduces the peak ir_variable memory usage i
From: Ian Romanick
Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 204KiB on 64-bit.
Before: IR MEM: variable usage / name / total: 5537064 1121441 6658505
After: IR MEM: variable usage / name / total: 5327760 1121441 6449201
Reduces the peak ir_variable memory usag
From: Ian Romanick
Also move num_state_slots inside ir_variable_data for better packing.
The payoff for this will come in a few more patches.
Signed-off-by: Ian Romanick
---
src/glsl/builtin_variables.cpp | 5 +--
src/glsl/ir.h | 56 ++
This series reduces the memory usage of ir_variable quite significantly.
The first couple patches add a mechanism to determine the amount of
memory used by any kind of IR object. This is used to collect the data
that is shown in the commit messages through the series.
Most of the rest of the pat
On Tue, May 27, 2014 at 7:49 PM, Ian Romanick wrote:
> From: Ian Romanick
>
> No change in the peak ir_variable memory usage in a trimmed apitrace of
> dota2 on 64-bit.
>
> No change in the peak ir_variable memory usage in a trimmed apitrace of
> dota2 on 32-bit.
>
> Signed-off-by: Ian Romanick
This breaks the build for me, see below. That's an out-of-tree build FWIW.
make[2]: Entering directory
'/home/daenzer/src/mesa-git/mesa/build-amd64/src/loader'
cd ../../.. && automake-1.14 --foreign src/loader/Makefile
src/loader/Makefile.am:42: warning: source file
'$(top_srcdir)/src/mesa/dr
Patches 1-8 inclusive are
Reviewed-by: Chris Forbes
On Wed, May 28, 2014 at 1:47 PM, Matt Turner wrote:
> Will get more complicated when fs_reg src becomes a pointer.
> ---
> src/mesa/drivers/dri/i965/brw_fs.cpp | 5 +
> src/mesa/drivers/dri/i965/brw_fs.h | 1 +
> 2 files changed, 6 inse