On Sat, Dec 07, 2013 at 07:06:36PM +0400, Vadim Girlin wrote: > On evergreen we have to reserve 1 stack element in some additional cases > besides the ones mentioned in the docs, but stack size computation was > recently reimplemented exactly as described in the docs by the patch that > added workarounds for stack issues on EG/CM, resulting in regressions > with some apps (Serious Sam 3). > > This patch fixes it by restoring previous behavior. > > Fixes https://bugs.freedesktop.org/show_bug.cgi?id=72369 > > Signed-off-by: Vadim Girlin <vadimgir...@gmail.com> > Cc: "10.0" <mesa-sta...@lists.freedesktop.org> > --- > src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 16 ++++++++++++---- > 1 file changed, 12 insertions(+), 4 deletions(-) > > diff --git a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp > b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp > index bc71cf8..355eb63 100644 > --- a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp > +++ b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp > @@ -770,7 +770,6 @@ void bc_finalizer::update_ngpr(unsigned gpr) { > unsigned bc_finalizer::get_stack_depth(node *n, unsigned &loops, > unsigned &ifs, unsigned add) { > unsigned stack_elements = add; > - bool has_non_wqm_push_with_loops_on_stack = false; > bool has_non_wqm_push = (add != 0); > region_node *r = n->is_region() ? > static_cast<region_node*>(n) : n->get_parent_region(); > @@ -781,8 +780,6 @@ unsigned bc_finalizer::get_stack_depth(node *n, unsigned > &loops, > while (r) { > if (r->is_loop()) { > ++loops; > - if (has_non_wqm_push) > - has_non_wqm_push_with_loops_on_stack = true; > } else { > ++ifs; > has_non_wqm_push = true; > @@ -795,15 +792,26 @@ unsigned bc_finalizer::get_stack_depth(node *n, > unsigned &loops, > switch (ctx.hw_class) { > case HW_CLASS_R600: > case HW_CLASS_R700: > + // If any non-WQM push is invoked, 2 elements should be > reserved. > if (has_non_wqm_push) > stack_elements += 2; > break; > case HW_CLASS_CAYMAN: > + // If any stack operation is invoked, 2 elements should be > reserved > if (stack_elements) > stack_elements += 2; > break; > case HW_CLASS_EVERGREEN: > - if (has_non_wqm_push_with_loops_on_stack) > + // According to the docs we need to reserve 1 element for each > of the > + // following cases: > + // 1) non-WQM push is used with WQM/LOOP frames on stack > + // 2) ALU_ELSE_AFTER is used at the point of max stack usage > + // NOTE: > + // It was found that the conditions above are not sufficient, > there are > + // other cases where we also need to reserve stack space, > that's why > + // we always reserve 1 stack element if we have non-WQM push on > stack. > + // Condition 2 is ignored for now because we don't use this > instruction. > + if (has_non_wqm_push) > ++stack_elements;
The kernel analyzer reports a stack size of 2 for compute shaders that have 3 levels of ALU_PUSH_BEFORE. This would suggest that you either need to reserve 2 sub-entries (stack_elements in the sb code) when there is a non-wqm push, or apply the CAYMAN rules to EVERGREEN. It is possible, though, that the kernel analyzer is over-allocating and this patch is correct, but I don't have any evidence for this yet. -Tom > break; > } > -- > 1.8.4.2 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev