On 12/06/2014 07:13 AM, Vadim Girlin wrote:
On 12/04/2014 01:43 AM, Dave Airlie wrote:
Hi Vadim,
I've been looking with Glenn's help into a bug in sb for a couple of
weeks now triggered by a change in how GLSL generates switch
statements.
I understand you probably aren't too interested in r600g but I believe
I'm hitting a design level problem and I would like some advice.
So it appears that GLSL can create loops that don't repeat for switch
statements, and it appears SB wasn't ready to handle such a thing.
Hi, Dave,
I suspect we should rather get rid of such loops somehow, i.e. convert
to something else, the loop that never repeats is not really a loop
anyway. AFAICS "continue" is not supported in switch statements
according to GLSL specs, so the loops generated for switch will never be
repeated. Am I missing something? Even if repeating is possible somehow,
at least we can get rid of the loops that are not repeated.
I think loops are less efficient than other control flow instructions on
r600g hw (at least because they increase stack usage), and possibly on
other hw too.
In fact it seems sb basically gets rid of it already in IR, it just
doesn't know how to translate resulting control flow to ISA, because so
far it only supports specific control flow structure for if-then-else
that was previously preserved during optimizations. I think it may be
not very hard to implement support for that in finalizer, I'll look into
it.
In fact handling that control flow in finalizer is not as easy as I
hoped, probably impossible, at least if we want to make it efficient. I
forgot about the limitations of R600 ISA.
OTOH it seems I've managed to fix the issues with loops, the patch is
attached (it's meant to be used instead of 7b0067d2). There are no
piglit regressions on evergreen, but I didn't test any real apps.
Vadim
sb has the ->is_loop() and it just checks !repeats.empty(), so this
meant in the finalizer code we'd fall into the if statement which
would then assert.
I hacked/fixed (more hacked), this in
7b0067d23a6f64cf83c42e7f11b2cd4100c569fe
which attempts to detect single pass loops and handle things that way.
However this lead to stack depth calculations being incorrectly done,
so I moved the single loop detect into the is_loop check, (see
attached patch).
This fixes the rendering in some places, but lead to a regression in
tests/shaders/glsl-vs-continue-in-switch-in-do-while.shader_test
error at : PHI t76||FP@R3.x, t128||FP@R3.x, t115||FP@R3.x,
t102||FP@R3.x, t89||FP@R3.x : expected
operand value t115||FP@R3.x, gpr contains t17||FP@R3.x
error at : PHI t76||FP@R3.x, t128||FP@R3.x, t115||FP@R3.x,
t102||FP@R3.x, t89||FP@R3.x : expected
operand value t102||FP@R3.x, gpr contains t17||FP@R3.x
Now Glenn suspected this was due to the is_loop check in
sb_shader.cpp:create_bbs,
and changing that check to only detect repeating loops removes that
issue,
but introduces stack sizing issues again, resulting in lockups/random
rendering.
So I just want to ask had you considered single loops with an always
break in sb design,
I didn't see such loops with any test cases, so I didn't even think
about it.
and perhaps some idea where things are going so wrong with the
register alloc above.
Not sure, but as long as the only "repeat" node is optimized away in
bc_parser because it's useless due to unconditional break, I suspect it
may be not easy to make all other code think that it's still a loop.
I've tried a quick fix to not optimize the repeat away for such loops,
but it results in other issues, probably it will require handling this
as a special case in other places, so it doesn't look like a good idea
either.
I'll try to implement the solution that I described above, that is,
translate resulting control flow back to ISA. If it won't be too much
work, it's probably the best way and it won't use loop instructions in
the end.
I suspect I'll keep digging into this, but its getting to the edges of
the brain space/time I can find!
Dave.
>From 4967ef90847f921fc0ef7c018ae7ae8048d2a6ce Mon Sep 17 00:00:00 2001
From: Vadim Girlin <vadimgir...@gmail.com>
Date: Mon, 8 Dec 2014 13:11:48 +0300
Subject: [PATCH] r600g/sb: fix issues with loops created for switch statements
---
src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 2 ++
src/gallium/drivers/r600/sb/sb_bc_parser.cpp | 2 ++
src/gallium/drivers/r600/sb/sb_if_conversion.cpp | 4 ++--
src/gallium/drivers/r600/sb/sb_ir.h | 9 +++++++--
src/gallium/drivers/r600/sb/sb_sched.cpp | 2 +-
5 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp
index f0849ca..3f362c4 100644
--- a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp
+++ b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp
@@ -110,6 +110,8 @@ int bc_finalizer::run() {
void bc_finalizer::finalize_loop(region_node* r) {
+ update_nstack(r);
+
cf_node *loop_start = sh.create_cf(CF_OP_LOOP_START_DX10);
cf_node *loop_end = sh.create_cf(CF_OP_LOOP_END);
diff --git a/src/gallium/drivers/r600/sb/sb_bc_parser.cpp b/src/gallium/drivers/r600/sb/sb_bc_parser.cpp
index d787e5b..403f938 100644
--- a/src/gallium/drivers/r600/sb/sb_bc_parser.cpp
+++ b/src/gallium/drivers/r600/sb/sb_bc_parser.cpp
@@ -758,6 +758,8 @@ int bc_parser::prepare_loop(cf_node* c) {
c->insert_before(reg);
rep->move(c, end->next);
+ reg->src_loop = true;
+
loop_stack.push(reg);
return 0;
}
diff --git a/src/gallium/drivers/r600/sb/sb_if_conversion.cpp b/src/gallium/drivers/r600/sb/sb_if_conversion.cpp
index 93edace..3f2b1b1 100644
--- a/src/gallium/drivers/r600/sb/sb_if_conversion.cpp
+++ b/src/gallium/drivers/r600/sb/sb_if_conversion.cpp
@@ -115,13 +115,13 @@ void if_conversion::convert_kill_instructions(region_node *r,
bool if_conversion::check_and_convert(region_node *r) {
depart_node *nd1 = static_cast<depart_node*>(r->first);
- if (!nd1->is_depart())
+ if (!nd1->is_depart() || nd1->target != r)
return false;
if_node *nif = static_cast<if_node*>(nd1->first);
if (!nif->is_if())
return false;
depart_node *nd2 = static_cast<depart_node*>(nif->first);
- if (!nd2->is_depart())
+ if (!nd2->is_depart() || nd2->target != r)
return false;
value* &em = nif->cond;
diff --git a/src/gallium/drivers/r600/sb/sb_ir.h b/src/gallium/drivers/r600/sb/sb_ir.h
index 85c3d06..711c2eb 100644
--- a/src/gallium/drivers/r600/sb/sb_ir.h
+++ b/src/gallium/drivers/r600/sb/sb_ir.h
@@ -1089,7 +1089,8 @@ typedef std::vector<repeat_node*> repeat_vec;
class region_node : public container_node {
protected:
region_node(unsigned id) : container_node(NT_REGION, NST_LIST), region_id(id),
- loop_phi(), phi(), vars_defined(), departs(), repeats() {}
+ loop_phi(), phi(), vars_defined(), departs(), repeats(), src_loop()
+ {}
public:
unsigned region_id;
@@ -1101,12 +1102,16 @@ public:
depart_vec departs;
repeat_vec repeats;
+ // true if region was created for loop in the parser, sometimes repeat_node
+ // may be optimized away so we need to remember this information
+ bool src_loop;
+
virtual bool accept(vpass &p, bool enter);
unsigned dep_count() { return departs.size(); }
unsigned rep_count() { return repeats.size() + 1; }
- bool is_loop() { return !repeats.empty(); }
+ bool is_loop() { return src_loop || !repeats.empty(); }
container_node* get_entry_code_location() {
node *p = first;
diff --git a/src/gallium/drivers/r600/sb/sb_sched.cpp b/src/gallium/drivers/r600/sb/sb_sched.cpp
index 1413916..3d62bcf 100644
--- a/src/gallium/drivers/r600/sb/sb_sched.cpp
+++ b/src/gallium/drivers/r600/sb/sb_sched.cpp
@@ -984,7 +984,7 @@ void post_scheduler::init_globals(val_set &s, bool prealloc) {
for (val_set::iterator I = s.begin(sh), E = s.end(sh); I != E; ++I) {
value *v = *I;
- if (v->is_sgpr() && !v->is_global()) {
+ if (v->is_sgpr()) {
v->set_global();
if (prealloc && v->is_fixed()) {
--
2.1.0
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev