Reviewed-by: Edward O'Callaghan
On 10/08/2016 04:05 AM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> Not sure if it's possible to avoid programming the block size twice (once for
> the userdata and once for the dispatch).
> ---
> docs/features.txt | 2 +-
> doc
This series is,
Reviewed-by: Edward O'Callaghan
On 10/08/2016 03:04 AM, Nicolai Hähnle wrote:
> Hi,
>
> just some random small nice-to-have patches. The first one in
> particular is helpful, but not strictly necessary, with enhanced
> layouts. Please review!
>
> Thanks,
> Nicolai
> --
> .../dr
On 10/08/2016 06:55 AM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> In some cases, a shader may have an input/output array but not use some
> entries in the middle. This happens with eON games, for example.
>
> We emit declarations that cover the entire array range even if there are
> som
Patches 1-9 & 13 are,
Reviewed-by: Edward O'Callaghan
The rest is too academic for me.. :)
On 10/08/2016 06:55 AM, Nicolai Hähnle wrote:
> Hi everybody,
>
> this series implements the missing piece of ARB_enhanced_layouts, which is
> the component numbers part of the extension. You can find th
This doesn't seem to be enough to get glxinfo to report GL 4.5 on my
Tonga or Kabini systems
Even with MESA_GLSL_VERSION_OVERRIDE=440 or 450 the maximum reported
GLSL version is 430, the override works with 420 however
I'm wondering if there's something in the Gallium code that's limiting
it to a
From: Marek Olšák
---
src/util/ralloc.c | 355 ++
src/util/ralloc.h | 84 -
2 files changed, 435 insertions(+), 4 deletions(-)
diff --git a/src/util/ralloc.c b/src/util/ralloc.c
index bf21ac3..a8e660f 100644
--- a/src/util/ralloc.
From: Marek Olšák
No change in behavior. ralloc_size is equivalent to rzalloc_size.
That will change though.
Calls not switched to rzalloc_size:
- ralloc_vasprintf
- glsl_type::name allocation (it's filled with snprintf)
- C++ classes where valgrind didn't show uninitialized values
I switched m
From: Marek Olšák
only do it in rzalloc_size as it was supposed to be
---
src/util/ralloc.c | 26 +++---
1 file changed, 11 insertions(+), 15 deletions(-)
diff --git a/src/util/ralloc.c b/src/util/ralloc.c
index 23c89e5..bf21ac3 100644
--- a/src/util/ralloc.c
+++ b/src/util/
From: Marek Olšák
---
src/compiler/glsl/glsl_lexer.ll | 16
src/compiler/glsl/glsl_parser_extras.cpp | 2 ++
src/compiler/glsl/glsl_parser_extras.h | 2 ++
3 files changed, 12 insertions(+), 8 deletions(-)
diff --git a/src/compiler/glsl/glsl_lexer.ll b/src/compiler
From: Marek Olšák
time GALLIUM_NOOP=1 ./run shaders/private/alien_isolation/ >/dev/null
Before (2 takes):
real    0m8.734s    0m8.773s
user    0m34.232s   0m34.348s
sys     0m0.084s    0m0.056s
After (2 takes):
real    0m8.448s    0m8.463s
user    0m33.104s   0m33.160s
sys     0m0.088s    0m0
From: Marek Olšák
don't rely on ralloc doing memset
---
src/compiler/glsl_types.cpp | 38 ++
src/compiler/glsl_types.h | 6 --
2 files changed, 6 insertions(+), 38 deletions(-)
diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
inde
From: Marek Olšák
---
src/compiler/glsl/opt_copy_propagation.cpp | 7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/compiler/glsl/opt_copy_propagation.cpp
b/src/compiler/glsl/opt_copy_propagation.cpp
index 02628cd..247c498 100644
--- a/src/compiler/glsl/opt_copy_propa
From: Marek Olšák
---
src/compiler/glsl/ast.h | 4 +-
src/compiler/glsl/ast_type.cpp | 13 +-
src/compiler/glsl/glsl_parser.yy | 202 +++
src/compiler/glsl/glsl_parser_extras.cpp | 4 +-
src/compiler/glsl/glsl_symbol_table.cpp
From: Marek Olšák
---
src/compiler/glsl/opt_copy_propagation_elements.cpp | 19 +--
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/src/compiler/glsl/opt_copy_propagation_elements.cpp
b/src/compiler/glsl/opt_copy_propagation_elements.cpp
index e4237cc..2bbe93d 100
From: Marek Olšák
---
src/compiler/glsl/opt_constant_propagation.cpp | 14 +++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/src/compiler/glsl/opt_constant_propagation.cpp
b/src/compiler/glsl/opt_constant_propagation.cpp
index 69bca74..4039512 100644
--- a/src/compil
From: Marek Olšák
---
src/compiler/glsl/ir.cpp | 4
src/compiler/glsl/ir.h | 13 -
src/compiler/glsl/lower_packed_varyings.cpp | 8 ++--
3 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/src/compiler/glsl/ir.cpp b/src/c
From: Marek Olšák
no ralloc_free occurrences
---
src/compiler/glsl/glsl_symbol_table.cpp | 16
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/src/compiler/glsl/glsl_symbol_table.cpp
b/src/compiler/glsl/glsl_symbol_table.cpp
index 0b21faa..e9d57e8 100644
--- a/src/
From: Marek Olšák
---
src/compiler/glsl/opt_dead_code_local.cpp | 12 +---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/src/compiler/glsl/opt_dead_code_local.cpp
b/src/compiler/glsl/opt_dead_code_local.cpp
index d38fd2b..fc979af 100644
--- a/src/compiler/glsl/opt_dead_c
From: Marek Olšák
---
src/compiler/glsl/glcpp/glcpp-lex.l | 2 +-
src/compiler/glsl/glcpp/glcpp-parse.y | 203 +++---
src/compiler/glsl/glcpp/glcpp.h | 1 +
3 files changed, 94 insertions(+), 112 deletions(-)
diff --git a/src/compiler/glsl/glcpp/glcpp-lex
From: Marek Olšák
---
src/util/ralloc.h | 9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/src/util/ralloc.h b/src/util/ralloc.h
index 7587e11..d74a398 100644
--- a/src/util/ralloc.h
+++ b/src/util/ralloc.h
@@ -414,39 +414,44 @@ bool ralloc_vasprintf_append(char **str,
Hi,
This patch series reduces the number of malloc calls in the GLSL
compiler by 63%. That leads to better compile times and less heap
thrashing.
It's done by switching memory allocations in the GLSL compiler to my
new linear allocator that allocates out of a fixed-sized buffer with
a monotonical
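The description above is cut off, so the following is only a minimal sketch of the general technique (a bump allocator over a fixed-size buffer whose offset only ever grows), not the allocator from the series. The names `linear_buf`, `linear_buf_init`, `linear_buf_alloc` and `linear_buf_free`, and the 8-byte alignment, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names for illustration; not the ralloc API from the series. */
struct linear_buf {
   uint8_t *base;   /* start of the fixed-size buffer */
   size_t size;     /* capacity in bytes */
   size_t offset;   /* bump offset; only ever grows */
};

/* Returns 1 on success, 0 if the backing buffer could not be allocated. */
static int linear_buf_init(struct linear_buf *buf, size_t size)
{
   buf->base = malloc(size);
   buf->size = size;
   buf->offset = 0;
   return buf->base != NULL;
}

/* Hands out the next 8-byte-aligned chunk, or NULL when the buffer is full. */
static void *linear_buf_alloc(struct linear_buf *buf, size_t bytes)
{
   assert(buf->offset <= buf->size);

   if (bytes > SIZE_MAX - 7)
      return NULL;                      /* rounding up would overflow */

   size_t aligned = (bytes + 7) & ~(size_t)7;
   if (aligned > buf->size - buf->offset)
      return NULL;                      /* out of space */

   void *ptr = buf->base + buf->offset;
   buf->offset += aligned;
   return ptr;
}

/* Individual chunks are never freed; the whole buffer goes at once. */
static void linear_buf_free(struct linear_buf *buf)
{
   free(buf->base);
   buf->base = NULL;
   buf->size = buf->offset = 0;
}
```

In this sketch many small allocations cost a single malloc call, which is the kind of saving the cover letter reports. The trade-off is that a single chunk cannot be released on its own.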
https://bugs.freedesktop.org/show_bug.cgi?id=98163
Bug ID: 98163
Summary: [PATCH] glx: usability: *must* also log issue context
("failed to open drm device").
Product: Mesa
Version: git
Hardware: x86 (IA32)
Reviewed-by: Marek Olšák
Marek
On Thu, Oct 6, 2016 at 7:51 PM, Axel Davy wrote:
> On systems with more than 4GB of ram,
> os_get_total_physical_memory was triggering an integer
> overflow for the linux and haiku path, when on
> 32 bits.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id
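As a rough illustration of the overflow described above (not the actual patch, which is not shown here), the sketch below multiplies a page count by a page size first in 32-bit arithmetic and then in 64-bit arithmetic. The function names are hypothetical, and `uint32_t` stands in for a 32-bit `long`.

```c
#include <stdint.h>

/* Hypothetical helper: the product is computed in 32 bits, so it wraps
 * modulo 2^32 once the total exceeds 4 GiB. */
static uint64_t total_bytes_narrow(uint32_t npages, uint32_t page_size)
{
   uint32_t product = npages * page_size;
   return product;
}

/* Hypothetical helper: widening one operand first makes the whole
 * multiplication happen in 64 bits. */
static uint64_t total_bytes_wide(uint32_t npages, uint32_t page_size)
{
   return (uint64_t)npages * page_size;
}
```

With 2097152 pages of 4096 bytes (8 GiB), the narrow version wraps to 0 while the wide version returns 8589934592.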
Ignore me I don't think patch 12 applied correctly
On Sat, 8 Oct 2016, 9:14 am Mike Lothian, wrote:
> This doesn't seem to be enough to get glxinfo to report GL 4.5 on my
> Tonga or Kabini systems
>
> Even with MESA_GLSL_VERSION_OVERRIDE=440 or 450 the maximum reported
> GLSL version is 430, the
Patches 1-5 are,
Reviewed-by: Edward O'Callaghan
I think it would be reassuring if you could run a before/after complete
piglit run also though if you have not already?
On 10/08/2016 09:58 PM, Marek Olšák wrote:
> Hi,
>
> This patch series reduces the number of malloc calls in the GLSL
> compil
On Sat, Oct 8, 2016 at 2:48 PM, Edward O'Callaghan
wrote:
> Patches 1-5 are,
> Reviewed-by: Edward O'Callaghan
>
> I think it would be reassuring if you could run a before/after complete
> piglit run also though if you have not already?
The series was tested with piglit and GL CTS. The testing o
On 10/09/2016 12:14 AM, Marek Olšák wrote:
> On Sat, Oct 8, 2016 at 2:48 PM, Edward O'Callaghan
> wrote:
>> Patches 1-5 are,
>> Reviewed-by: Edward O'Callaghan
>>
>> I think it would be reassuring if you could run a before/after complete
>> piglit run also though if you have not already?
>
> T
For the series:
Reviewed-by: Marek Olšák
Marek
On Fri, Oct 7, 2016 at 6:04 PM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> ---
> src/gallium/drivers/radeonsi/si_pipe.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
> b
Reviewed-by: Marek Olšák
Marek
On Fri, Oct 7, 2016 at 7:05 PM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> Not sure if it's possible to avoid programming the block size twice (once for
> the userdata and once for the dispatch).
> ---
> docs/features.txt | 2 +
we might want to add more folding passes here, so make it a bit more generic
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 124 ++---
1 file changed, 62 insertions(+), 62 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir
just little random noise in shader-db
will help in the next patch
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
b/src/gallium/drivers/no
changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0
/benchmark_duration_ms=6 /width=1024 /height=640:
score: 1026 -> 1046
changes for shader-db:
total instructions in shared programs : 2752260 -> 2742049 (-0.37%)
total gprs used in shared programs: 380557 -> 380557 (0
This series reworks the structure of the pass to make it easier to add
more optimisations to it.
Also implements folding for mad on gf100+ ISAs to reduce instruction count
by ~0.37%
I can only test it on a gk106 for now.
Karol Herbst (6):
nv50/ir: add LIMM form of mad to gk110
nv50/ir: add L
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 49 ++
1 file changed, 32 insertions(+), 17 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/codegen/nv50_ir.h | 2 +-
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 20 +++-
2 files changed, 8 insertions(+), 14 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
b/src/galliu
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 32 --
1 file changed, 23 insertions(+), 9 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
i
Reviewed-by: Edward O'Callaghan
On 10/07/2016 04:59 AM, Axel Davy wrote:
> Gallium nine relies on aliasing to work with this function.
> Without this patch, dirty region tracking was incorrect, which
> could lead to incorrect textures or vertex buffers.
> Fixes several game bugs with nine.
> Fixe
After this change each driver can request LLVM with a specific version
and specific targets/components anywhere.
For gallium "--enable-gallium-llvm" is only needed if at least one
driver calls "gallium_require_llvm()".
If the flag is set to auto it will default to no now if no driver with
"gallium
FYI, we use ralloc for a lot more than just the glsl compiler so the first
few changes make me a bit nervous. There was someone working on making our
driver more undefined-memory-friendly but I don't know what happened to
those patches.
On Oct 8, 2016 3:58 AM, "Marek Olšák" wrote:
Hi,
This p
Hi,
I just skimmed through the list of supported extensions by
both nvc0 and radeonsi on mesamatrix and the number
at the top of the page should be equal for them.
nvc0 supports GL_ARB_compute_variable_group_size, radeonsi doesn't.
radeonsi supports GL_ARB_shader_stencil_export, nvc0 doesn't.
Ev
You're probably overlooking GL_ARB_shader_group_vote.
Boszormenyi Zoltan wrote on 08.10.2016 17:53:
> Hi,
>
> I just skimmed through the list of supported extensions by
> both nvc0 and radeonsi on mesamatrix and the number
> at the top of the page should be equal for them.
>
> nvc0 supports GL_A
Usually we prefix with gm107/ir, gk110/ir, etc...
More comments below.
On 10/08/2016 05:43 PM, Karol Herbst wrote:
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 32 --
1 file changed, 23 insertions(+), 9 deletions(-)
diff --git a/src
Signed-off-by: Jason Ekstrand
---
src/intel/vulkan/anv_pipeline.c | 127 +---
src/intel/vulkan/anv_private.h | 4 --
2 files changed, 54 insertions(+), 77 deletions(-)
diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 4b5c
Now that we don't have meta, we have no need for a gen-agnostic pipeline
create path. We can, instead, just generate one Create*Pipelines function
per gen and be done with it.
Signed-off-by: Jason Ekstrand
---
src/intel/vulkan/anv_genX.h | 7 ---
src/intel/vulkan/anv_pipeline.c | 106 --
Now that meta is gone and we're using blorp, we don't need all of the usage
hacks. Instead, the usage provided by the app is exactly the usage that we
want because the app is the only thing creating image views.
Signed-off-by: Jason Ekstrand
---
src/intel/vulkan/anv_image.c | 122 +---
Without meta, we no longer need the _init helpers and the ability to back
an image view with surface states allocated out of the command buffer.
Signed-off-by: Jason Ekstrand
---
src/intel/vulkan/anv_image.c | 90 --
src/intel/vulkan/anv_private.h | 10 -
Signed-off-by: Jason Ekstrand
---
src/intel/vulkan/genX_pipeline_util.h | 5 +
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/src/intel/vulkan/genX_pipeline_util.h
b/src/intel/vulkan/genX_pipeline_util.h
index b390c5c..c518cfd 100644
--- a/src/intel/vulkan/genX_pipeline_util.h
On Sat, Oct 8, 2016 at 5:58 PM, Jason Ekstrand wrote:
> FYI, we use ralloc for a lot more than just the glsl compiler so the first
> few changes make me a bit nervous. There was someone working on making our
> driver more undefined-memory-friendly but I don't know what happened to
> those patch
On 10/08/2016 02:12 AM, Ian Romanick wrote:
> From: Ian Romanick
>
> This was found partially by inspection and partially by hitting a
> problem while working on nir_op_pack_int64_2x32_split. The code
> previously would 'continue' if (instr->src[i].src.is_ssa), but the code
> immediately followi
"rework" is not the right term in my opinion. :)
On 10/08/2016 05:43 PM, Karol Herbst wrote:
we might want to add more folding passes here, so make it a bit more generic
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 124 ++---
1 file chan
On 10/08/2016 05:43 PM, Karol Herbst wrote:
just little random noise in shader-db
Like what? Please elaborate.
will help in the next patch
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --g
On 10/08/2016 05:43 PM, Karol Herbst wrote:
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/codegen/nv50_ir.h | 2 +-
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 20 +++-
2 files changed, 8 insertions(+), 14 deletions(-)
diff --git a/src/gallium
On Sat, Oct 8, 2016 at 9:30 AM, Marek Olšák wrote:
> On Sat, Oct 8, 2016 at 5:58 PM, Jason Ekstrand
> wrote:
> > FYI, we use ralloc for a lot more than just the glsl compiler so the
> first
> > few changes make me a bit nervous. There was someone working on making
> our
> > driver more undefi
Please, update the prefix.
Also the same comment applies here, and I think the best way is to
enable that PostRAConstantFoldingPass for nvc0+ in a separate patch at
the end of that series. That way you won't break things and mupuf will
appreciate. :)
On 10/08/2016 05:43 PM, Karol Herbst wrot
This breaks a bunch of things, like:
spec/glsl-4.30/execution/built-in-functions/cs-all-bvec2-using-if: fail
spec/glsl-4.30/execution/built-in-functions/cs-all-bvec3-using-if: fail
spec/glsl-4.30/execution/built-in-functions/cs-all-bvec4-using-if: fail
spec/glsl-4.30/execution/built-in-functions/
On Oct 8, 2016 10:36 AM, "Michael Schellenberger Costa" <
mschellenbergerco...@googlemail.com> wrote:
>
> Hi Jason,
>
>
> is there a reason that the first counter "i" is declared outside of the
loop?
Not really. I just moved the code that was there before. I have no idea
why it was written that w
2016-10-08 18:54 GMT+02:00 Samuel Pitoiset :
> Please, update the prefix.
>
> Also the same comment applies here, and I think the best way is to enable
> that PostRAConstantFoldingPass for nvc0+ in a separate patch at the end of
> that series. That way you won't break things and mupuf will apprecia
On 10/08/2016 07:59 PM, Karol Herbst wrote:
2016-10-08 18:54 GMT+02:00 Samuel Pitoiset :
Please, update the prefix.
Also the same comment applies here, and I think the best way is to enable
that PostRAConstantFoldingPass for nvc0+ in a separate patch at the end of
that series. That way you wo
On Oct 8, 2016 10:52 AM, "Jason Ekstrand" wrote:
>
> On Oct 8, 2016 10:36 AM, "Michael Schellenberger Costa" <
mschellenbergerco...@googlemail.com> wrote:
> >
> > Hi Jason,
> >
> >
> > is there a reason that the first counter "i" is declared outside of the
loop?
>
> Not really. I just moved the co
the emit code uses 19 everywhere, so we should let
CodeEmitterGM107::longIMMD and TargetNVC0::insnCanLoad check against
this too
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 6 +++---
src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp |
Pretty sure that the float one is fine. And there's a 20th bit, it
just behaves differently than one might expect. I don't remember all
the details though...
On Sat, Oct 8, 2016 at 3:23 PM, Karol Herbst wrote:
> the emit code uses 19 everywhere, so we should let
> CodeEmitterGM107::longIMMD and T
2016-10-08 21:26 GMT+02:00 Ilia Mirkin :
> Pretty sure that the float one is fine. And there's a 20th bit, it
> just behaves differently than one might expect. I don't remember all
> the details though...
ohh I think you are right, just took a look inside
CodeEmitterGM107::emitIMMD and it indeed d
On 10/08/2016 09:26 PM, Ilia Mirkin wrote:
Pretty sure that the float one is fine. And there's a 20th bit, it
just behaves differently than one might expect. I don't remember all
the details though...
Yep, the float one is correct. The 20th bit is the sign bit, which is
correctly emitted in
total instructions in shared programs : 2286901 -> 2284473 (-0.11%)
total gprs used in shared programs    : 335256 -> 335273 (0.01%)
total local used in shared programs   : 31968 -> 31968 (0.00%)
          local   gpr   inst   bytes
helped    0       41    852
looks great, a few comments below
2016-10-08 21:55 GMT+02:00 Samuel Pitoiset :
> total instructions in shared programs : 2286901 -> 2284473 (-0.11%)
> total gprs used in shared programs    : 335256 -> 335273 (0.01%)
> total local used in shared programs   : 31968 -> 31968 (0.00%)
>
>
On Sat, Oct 8, 2016 at 3:55 PM, Samuel Pitoiset
wrote:
> total instructions in shared programs : 2286901 -> 2284473 (-0.11%)
> total gprs used in shared programs    : 335256 -> 335273 (0.01%)
> total local used in shared programs   : 31968 -> 31968 (0.00%)
>
> localgpr i
Hi Marek
Series is
Tested-by: Edmondo Tommasina
I've merged your series of patches on top of mesa git master and
tested on a Radeon RX 470. No regressions found.
OpenGL renderer string: Gallium 0.4 on AMD POLARIS10 (DRM 3.3.0 /
4.8.0-rc6, LLVM 3.9.0)
OpenGL core profile version string: 4.3 (Cor
On Fri, Oct 7, 2016 at 3:55 PM, Nicolai Hähnle wrote:
> Hi everybody,
>
> this series implements the missing piece of ARB_enhanced_layouts, which is
> the component numbers part of the extension. You can find the full series
> here: https://cgit.freedesktop.org/~nh/mesa/log/?h=ARB_enhanced_layout
A function with multiple returns would have had multiple preret settings
at the top of the function. While this is unlikely to have caused issues
since we don't use functions in earnest, it could have in some cases
overflowed the call stack, in case a function had a lot of early
returns.
Signed-of
On Sat, Oct 8, 2016 at 8:04 PM, Ilia Mirkin wrote:
> On Fri, Oct 7, 2016 at 3:55 PM, Nicolai Hähnle wrote:
>> Hi everybody,
>>
>> this series implements the missing piece of ARB_enhanced_layouts, which is
>> the component numbers part of the extension. You can find the full series
>> here: https
Signed-off-by: Edward O'Callaghan
---
src/intel/vulkan/anv_meta.h | 5 -
src/intel/vulkan/anv_nir.h | 5 -
src/intel/vulkan/anv_private.h | 5 -
src/intel/vulkan/anv_wsi.h | 5 -
src/intel/vulkan/vk_format_info.h | 5 -
5 files changed, 20 insertions(+),