On (04/26/17 15:08), Joonsoo Kim wrote:
> > > +struct zram_hash {
> > > + spinlock_t lock;
> > > + struct rb_root rb_root;
> > >  };
> > 
> > just a note.
> > 
> > we can easily have N CPUs spinning on ->lock for __zram_dedup_get() lookup,
> > which can involve a potentially slow zcomp_decompress() [zlib, for example,
> > with 64k pages] and memcmp(). the larger PAGE_SHIFT is, the more serialized
> > IOs become. in theory, at least.
> > 
> > CPU0                        CPU1                    ...     CPUN
> > 
> > __zram_bvec_write()         __zram_bvec_write()             __zram_bvec_write()
> >  zram_dedup_find()           zram_dedup_find()               zram_dedup_find()
> >   spin_lock(&hash->lock);
> >                               spin_lock(&hash->lock);
> >                                                               spin_lock(&hash->lock);
> >    __zram_dedup_get()
> >     zcomp_decompress()
> >      ...
> > 
> > 
> > so maybe there is a way to use a read-write lock instead of a spinlock for the
> > hash and reduce write/read IO serialization.
> 
> In fact, dedup releases hash->lock before doing zcomp_decompress(). So, the
> above contention cannot happen.

oh, my bad. you are right. somehow I didn't spot the unlock.
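
for anyone else reading the thread, the locking shape in question is roughly
the following (an illustrative sketch only, not the actual patch code;
lookup_checksum(), entry_matches() and drop_entry() are made-up stand-ins for
the dedup helpers). the bucket spinlock covers just the rb-tree walk and the
refcount bump; the expensive zcomp_decompress() + memcmp() runs after the
unlock, so writers are not serialized behind the decompression:

/*
 * sketch only: lookup_checksum()/entry_matches()/drop_entry() are
 * hypothetical stand-ins for the real dedup helpers.
 */
static struct zram_entry *dedup_find_sketch(struct zram_hash *hash,
                                            u32 checksum, void *mem)
{
        struct zram_entry *entry;

        spin_lock(&hash->lock);
        /* cheap part under the lock: walk hash->rb_root by checksum */
        entry = lookup_checksum(&hash->rb_root, checksum);
        if (entry)
                entry->refcount++;      /* pin it before dropping the lock */
        spin_unlock(&hash->lock);

        if (!entry)
                return NULL;

        /* slow part, bucket lock released: decompress + memcmp */
        if (!entry_matches(entry, mem)) {
                drop_entry(hash, entry);        /* put the extra reference */
                return NULL;
        }
        return entry;
}

so only the rb-tree walk itself is serialized, which takes care of the
scenario I drew above.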

        -ss
