On Mon, 2013-12-09 at 10:56 -0500, Tom Stellard wrote: > On Sat, Dec 07, 2013 at 07:06:36PM +0400, Vadim Girlin wrote: > > On evergreen we have to reserve 1 stack element in some additional cases > > besides the ones mentioned in the docs, but stack size computation was > > recently reimplemented exactly as described in the docs by the patch that > > added workarounds for stack issues on EG/CM, resulting in regressions > > with some apps (Serious Sam 3). > > > > This patch fixes it by restoring previous behavior. > > > > Fixes https://bugs.freedesktop.org/show_bug.cgi?id=72369 > > > > Signed-off-by: Vadim Girlin <vadimgir...@gmail.com> > > Cc: "10.0" <mesa-sta...@lists.freedesktop.org> > > --- > > src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 16 ++++++++++++---- > > 1 file changed, 12 insertions(+), 4 deletions(-) > > > > diff --git a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp > > b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp > > index bc71cf8..355eb63 100644 > > --- a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp > > +++ b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp > > @@ -770,7 +770,6 @@ void bc_finalizer::update_ngpr(unsigned gpr) { > > unsigned bc_finalizer::get_stack_depth(node *n, unsigned &loops, > > unsigned &ifs, unsigned add) { > > unsigned stack_elements = add; > > - bool has_non_wqm_push_with_loops_on_stack = false; > > bool has_non_wqm_push = (add != 0); > > region_node *r = n->is_region() ? > > static_cast<region_node*>(n) : n->get_parent_region(); > > @@ -781,8 +780,6 @@ unsigned bc_finalizer::get_stack_depth(node *n, > > unsigned &loops, > > while (r) { > > if (r->is_loop()) { > > ++loops; > > - if (has_non_wqm_push) > > - has_non_wqm_push_with_loops_on_stack = true; > > } else { > > ++ifs; > > has_non_wqm_push = true; > > @@ -795,15 +792,26 @@ unsigned bc_finalizer::get_stack_depth(node *n, > > unsigned &loops, > > switch (ctx.hw_class) { > > case HW_CLASS_R600: > > case HW_CLASS_R700: > > + // If any non-WQM push is invoked, 2 elements should be > > reserved. > > if (has_non_wqm_push) > > stack_elements += 2; > > break; > > case HW_CLASS_CAYMAN: > > + // If any stack operation is invoked, 2 elements should be > > reserved > > if (stack_elements) > > stack_elements += 2; > > break; > > case HW_CLASS_EVERGREEN: > > - if (has_non_wqm_push_with_loops_on_stack) > > + // According to the docs we need to reserve 1 element for each > > of the > > + // following cases: > > + // 1) non-WQM push is used with WQM/LOOP frames on stack > > + // 2) ALU_ELSE_AFTER is used at the point of max stack usage > > + // NOTE: > > + // It was found that the conditions above are not sufficient, > > there are > > + // other cases where we also need to reserve stack space, > > that's why > > + // we always reserve 1 stack element if we have non-WQM push on > > stack. > > + // Condition 2 is ignored for now because we don't use this > > instruction. > > + if (has_non_wqm_push) > > ++stack_elements; > > The kernel analyzer reports a stack size of 2 for compute shaders that > have 3 levels of ALU_PUSH_BEFORE. This would suggest that you either need to > reserve 2 sub-entries (stack_elements in the sb code) when there is a > non-wqm push, or apply the CAYMAN rules to EVERGREEN. > > It is possible, though, that the kernel analyzer is over-allocating and > this patch is correct, but I don't have any evidence for this yet.
Is there any test that fails with this patch? AFAIK this algorithm worked fine for about 8 months in both old and sb backends, so I'd rather prefer to have any evidence that this is not correct before increasing stack allocation and reducing performance. Vadim > > -Tom > > > > break; > > } > > -- > > 1.8.4.2 > > > > _______________________________________________ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev