On Tue, Sep 12, 2017 at 2:40 AM, Marathe, Yogesh <yogesh.mara...@intel.com> wrote:
> Hi Jason, > > > > On the asserts you’ve mentioned below, I assume we need to add them after > ‘bufmgr->num_buckets++’ in add_bucket() as num_buckets could be 0 initially. > Yes, because otherwise bucket_for_size will return NULL all the time. > Another clarification on ~1%, we meant approx. 1% there, that’s an > improvement we saw in 3Dmark total not a degradation, we’ll correct it in > commit msg. > > > > Rest all review comments from you, Tapani and Emil are noted & > implemented, we are working on running it through mesa CI/CTS and we should > see a v2 for review after that. > > > > Regards, > > Yogesh. > > > > *From:* mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] *On > Behalf Of *Jason Ekstrand > *Sent:* Friday, September 8, 2017 9:09 PM > *To:* Muthukumar, Aravindan <aravindan.muthuku...@intel.com> > *Cc:* mesa-dev@lists.freedesktop.org; J Karanje, Kedar < > kedar.j.kara...@intel.com> > *Subject:* Re: [Mesa-dev] [PATCH] i965 : optimized bucket index > calculation > > > > In general, I'm very concerned about how this handles rounding behavior. > Almost everywhere, you round down when what you want to do is round up. > Also, as I said on IRC, I'd like to see some asserts in add_bucket so that > we are sure this calculation is correct. In particular, I'd like to see > > > > assert(bucket_for_size(size) == &bufmgr->cache_bucket[i]); > > assert(bucket_for_size(size - 2048) == &bufmgr->cache_bucket[i]); > > assert(bucket_for_size(size + 1) != &bufmgr->cache_bucket[i]); > > > > We need to check on both sides of size to be 100% sure we're doing our > rounding correctly. > > > > On Fri, Sep 8, 2017 at 1:11 AM, <aravindan.muthuku...@intel.com> wrote: > > From: Aravindan Muthukumar <aravindan.muthuku...@intel.com> > > Avoiding the loop which was running with O(n) complexity. > Now the complexity has been reduced to O(1) > > Tested with piglit. > Slight performance improvement (~1%) in 3d mark. > > > > Which 3dmark test? Also, what's the error in that 1%? > > > > Change-Id: Id099f1cd24ad5b691a69070eda79b8f4e9be39a6 > Signed-off-by: Aravindan Muthukumar <aravindan.muthuku...@intel.com> > Signed-off-by: Kedar Karanje <kedar.j.kara...@intel.com> > Reviewed-by: Yogesh Marathe <yogesh.mara...@intel.com> > --- > src/mesa/drivers/dri/i965/brw_bufmgr.c | 48 > +++++++++++++++++++++++++++++----- > 1 file changed, 41 insertions(+), 7 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c > b/src/mesa/drivers/dri/i965/brw_bufmgr.c > index 5b4e784..18cb166 100644 > --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c > +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c > @@ -87,6 +87,11 @@ > > #define memclear(s) memset(&s, 0, sizeof(s)) > > +/* Macros for BO cache size */ > +#define CACHE_PAGE_SIZE 4096 > > > > Just call this PAGE_SIZE > > > > +#define PAGE_SIZE_SHIFT 12 > +#define BO_CACHE_PAGE_SIZE (4 * CACHE_PAGE_SIZE) > > > > I think I'd rather we just use 4 * PAGE_SIZE explicitly than have this > extra #define. I think it's making things harder to read and not easier. > > > > + > #define FILE_DEBUG_FLAG DEBUG_BUFMGR > > static inline int > @@ -181,19 +186,48 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr, uint32_t > pitch, uint32_t tiling) > return ALIGN(pitch, tile_width); > } > > +/* > + * This functions is to find the correct bucket fit for the input size. > + * This function works with O(1) complexity when the requested size > + * was queried instead of iterating the size through all the buckets. > + */ > static struct bo_cache_bucket * > bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size) > { > - int i; > + struct bo_cache_bucket *bucket = NULL; > + int x=0,index = -1; > + int row, col=0; > > - for (i = 0; i < bufmgr->num_buckets; i++) { > - struct bo_cache_bucket *bucket = &bufmgr->cache_bucket[i]; > - if (bucket->size >= size) { > - return bucket; > - } > + /* condition for size less than 4*4096 (4KB) page size */ > + if(size < BO_CACHE_PAGE_SIZE){ > > > > This should be "<=" > > > > + index = (size>>PAGE_SIZE_SHIFT)+((size%(1<<PAGE_SIZE_SHIFT)?1:0) > )-1; > > > > I agree with tapani, that this can easily be an early return. > > > > I think we can also make this calculation a lot more clear: > > > > index = DIV_ROUND_UP(size, PAGE_SIZE) - 1; > > > > } > + else{ > + /* When the size is more than 4*4096, the logic follows a matrix > method > + * where the index will be searched using Arithmetico-Geometric > progression. > + * So the given size will be divided by 4096 & the index will be > traced out. > + */ > + x = size>>PAGE_SIZE_SHIFT; > > > > This rounds down not up like you want. > > > > - return NULL; > + /* Find the row using Geometric Progression. The highest bit set > will give > + * the row number. num = a * r^(n-1) where num = size a = 4 r = 2 > + */ > + row = 31 - __builtin_clz(x>>1); > + > + /* Find the column using AP but using the row value > + * calculated using GP. > + */ > + col =((x-(1<<(row+1)))/(1<<(row-1)))+1; > + col += (size%(1<<PAGE_SIZE_SHIFT<<(row-1)))?1:0; > + > + /* Finding the index value using calculated row and col number */ > + index = ((row-1)<<2) + col + 2; > > > > I think this can probably also be a lot simpler. How about something like > this: > > > > pages = DIV_ROUND_UP(size, PAGE_SIZE); > > /* Steal this from anv_allocator.c */ > > pages_log2 = ilog2_round_up(pages); > > row = pages_log2 - 1; > > col = DIV_ROUND_UP(pages, (1 << (pages_log2 - 2))); > > index = row * 4 + col; > > > > + } > + > + /* Checking the error condition */ > + bucket = (index >= 0 && index < bufmgr->num_buckets)?(&bufmgr- > >cache_bucket[index]):NULL; > + return bucket; > } > > int > -- > 2.7.4 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev