We don't hold off RCU here so what guarantees a racing global
shrinker walk doesn't trip over this shrinker_put() call dropping
the refcount to zero and freeing occurring in a different context...
> + /*
> + * We have already exited the read-side of the RCU critical section
> + * before calling do_shrink_slab(), so the shrinker_info may have
> + * been released in expand_one_shrinker_info(); reacquire the
> + * shrinker_info.
> + */
> + index++;
> + goto again;
With that, what makes the use of shrinker_info in
xchg_nr_deferred_memcg() in do_shrink_slab() coherent and valid?
-Dave.
--
Dave Chinner
da...@fromorbit.com
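For illustration, the refcount+RCU walk pattern being debated looks
roughly like this (a sketch only; shrinker_try_get() and the list
layout are assumed from the discussion, not taken from the actual
patch):

	/* sc and priority are set up by the caller as usual */
	rcu_read_lock();
	list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
		/* Skip shrinkers that are already being torn down. */
		if (!shrinker_try_get(shrinker))
			continue;
		rcu_read_unlock();

		do_shrink_slab(&sc, shrinker, priority);

		rcu_read_lock();
		/*
		 * If unregistration raced with us, this may drop the last
		 * reference. The object must then be freed via RCU (or the
		 * unregister side must wait), otherwise a concurrent walker
		 * that has seen the pointer but not yet taken its reference
		 * can trip over freed memory - exactly the race being asked
		 * about above.
		 */
		shrinker_put(shrinker);
	}
	rcu_read_unlock();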
> if (!shrinker)
> return;
>
> + if (shrinker->flags & SHRINKER_REGISTERED) {
> + shrinker_put(shrinker);
> + wait_for_completion(&shrinker->done);
> + }
Needs a comment explaining why we need to wait here...
> +
> down_write(&shrinker_rwsem);
> return 0;
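The waiting logic that comment would describe is roughly this (a
sketch; it assumes shrinker->done is a completion signalled by the
final shrinker_put()):

	void shrinker_put(struct shrinker *shrinker)
	{
		if (refcount_dec_and_test(&shrinker->refcount))
			complete(&shrinker->done);
	}

	/*
	 * In the unregister path: drop the reference taken at registration,
	 * then block until every walker holding a temporary reference has
	 * dropped it. After wait_for_completion() returns, nothing else can
	 * be inside do_shrink_slab() for this shrinker, so it is safe to
	 * tear it down and free it.
	 */
	shrinker_put(shrinker);
	wait_for_completion(&shrinker->done);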
> @@ -419,56 +470,63 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
> if (unlikely(!info))
> goto unlock;
>
> - for_each_set_bit(i, info->map, info->map_nr_max) {
> - struct shrink_control sc = {
> - .gfp_mask = gfp_mask,
> - .nid = nid,
> - .memcg = memcg,
> - };
> - struct shrinker *shrinker;
> + for (; index < shrinker_id_to_index(info->map_nr_max); index++) {
> + struct shrinker_info_unit *unit;
This adds another layer of indent to shrink_slab_memcg(). Please
factor it first so that the code ends up being readable. Doing that
first as a separate patch will also make the actual algorithm
changes in this patch be much more obvious - this huge hunk of
diff is pretty much impossible to review...
-Dave.
--
Dave Chinner
da...@fromorbit.com
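One way to do the factoring suggested above (a sketch; the helper name
and signature are invented for illustration): pull the existing
per-shrinker body, which looks the shrinker up in shrinker_idr, out
into its own function first, so the later switch to shrinker_info
units doesn't add another indent level:

	static unsigned long shrink_slab_memcg_one(int shrinker_id,
						   gfp_t gfp_mask, int nid,
						   struct mem_cgroup *memcg,
						   int priority)
	{
		struct shrink_control sc = {
			.gfp_mask = gfp_mask,
			.nid = nid,
			.memcg = memcg,
		};
		struct shrinker *shrinker;

		shrinker = idr_find(&shrinker_idr, shrinker_id);
		if (unlikely(!shrinker || !(shrinker->flags & SHRINKER_REGISTERED)))
			return 0;

		return do_shrink_slab(&sc, shrinker, priority);
	}

With that in place, the algorithm change in this patch reduces to
rewriting the loop that calls the helper.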
> ...following cases.
> This commit uses the refcount+RCU method [5] proposed by Dave Chinner
> to re-implement the lockless global slab shrink. The memcg slab shrink is
> handled in the subsequent patch.
> ---
> include/linux/shrinker.h | 17 ++
>
e "this is obviously correct" that what we have
now.
> So not adding that super simple
> helper is not exactly the best choice in my opinion.
Each to their own - I much prefer the existing style/API over having
to go look up a helper function every time I want to check some
random shrinker has been set up correctly.
-Dave.
--
Dave Chinner
da...@fromorbit.com
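For context, the "super simple helper" being debated would be
something like this one-liner (hypothetical name), versus open-coding
the flag test at each call site, which is the style Dave prefers:

	static inline bool shrinker_is_registered(struct shrinker *shrinker)
	{
		return shrinker->flags & SHRINKER_REGISTERED;
	}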
On Wed, Jul 26, 2023 at 05:14:09PM +0800, Qi Zheng wrote:
> On 2023/7/26 16:08, Dave Chinner wrote:
> > On Mon, Jul 24, 2023 at 05:43:51PM +0800, Qi Zheng wrote:
> > > @@ -122,6 +126,13 @@ void shrinker_free_non_registered(struct shrinker *shrinker);
> >
> We used to implement the lockless slab shrink with SRCU [2], but then
> kernel test robot reported -88.8% regression in
> stress-ng.ramfs.ops_per_sec test case [3], so we reverted it [4].
>
> This commit uses the refcount+RCU method [5] proposed by Dave Chinner
> to re-implement the lockless global slab shrink.
if (shrinker->flags & SHRINKER_MEMCG_AWARE)
unregister_memcg_shrinker(shrinker);
up_write(&shrinker_rwsem);
if (debugfs_entry)
shrinker_debugfs_remove(debugfs_entry, debugfs_id);
kfree(shrinker->nr_deferred);
kfree(shrinker);
}
EXPORT_SYMBOL_GPL(shrinker_free);
--
Dave Chinner
da...@fromorbit.com
On Fri, Jun 23, 2023 at 09:10:57PM +0800, Qi Zheng wrote:
> On 2023/6/23 14:29, Dave Chinner wrote:
> > On Thu, Jun 22, 2023 at 05:12:02PM +0200, Vlastimil Babka wrote:
> > > On 6/22/23 10:53, Qi Zheng wrote:
> > Yes, I suggested the IDR route because radix tree lookups under RCU...
...is safe in. But for me to work that out from
first principles? I just don't have the time to do that right now.
> IIUC this is why Dave in [4] suggests unifying shrink_slab() with
> shrink_slab_memcg(), as the latter doesn't iterate the list but uses IDR.
Yes, I suggested the IDR route because radix tree lookups under RCU
with reference counted objects are a known safe pattern that we can
easily confirm is correct or not. Hence I suggested the unification
+ IDR route because it makes the life of reviewers so, so much
easier...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
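The "known safe pattern" referred to here, sketched from first
principles (illustrative names; the lookup could equally be an
xarray):

	rcu_read_lock();
	shrinker = idr_find(&shrinker_idr, id);
	/*
	 * The ID lookup is RCU-safe, but the object is only pinned once
	 * we take a reference inside the read-side critical section. A
	 * failed try-get means the object is already being torn down.
	 */
	if (shrinker && shrinker_try_get(shrinker)) {
		rcu_read_unlock();
		do_shrink_slab(&sc, shrinker, priority);
		shrinker_put(shrinker);
	} else {
		rcu_read_unlock();
	}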
...patch set results in...
The other advantage of this is that it will break all the existing
out of tree code and third party modules using the old API and will
no longer work with a kernel using lockless slab shrinkers. They
need to break (both at the source and binary levels) to stop bad
things from happening due to using unconverted shrinkers in the new
setup.
-Dave.
--
Dave Chinner
da...@fromorbit.com
(1 << 31)	/* do not map the buffer */
This doesn't make a whole lotta sense to me. It's blown up in a
tracepoint macro in XFS that was not changed at all in 5.18-rc1, nor
was any of the surrounding XFS code or contexts. Perhaps something
outside XFS changed to cause this on these platforms?
Can you bisect this, please?
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Wed, Sep 01, 2021 at 07:07:34PM -0400, Felix Kuehling wrote:
> On 2021-09-01 6:03 p.m., Dave Chinner wrote:
> > On Wed, Sep 01, 2021 at 11:40:43AM -0400, Felix Kuehling wrote:
> > > Am 2021-09-01 um 4:29 a.m. schrieb Christoph Hellwig:
> > > > On Mon, Aug 30, 2
...minefield that is trivial to get wrong and very
difficult to fix - just look at the historical mess that RDMA
to/from file backed and/or DAX pages has been.
So, really, from my perspective as a filesystem engineer, I want to
see an actual specification for how this new memory type is going to
interact with filesystems and the page cache so everyone has some
idea of how this is going to work and can point out how it doesn't
work before code that simply doesn't work is pushed out into
production systems and then merged.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
static struct page *dax_busy_page(void *entry)
> for_each_mapped_pfn(entry, pfn) {
> struct page *page = pfn_to_page(pfn);
>
> - if (page_ref_count(page) > 1)
> + if (!dax_layout_is_idle_page(page))
Here too.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
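The helper implied by the hunk above would look something like this
(my reconstruction, consistent with the "> 1 means busy" test it
replaces; the point is to hide the refcount policy behind one
predicate instead of open-coding it):

	static inline bool dax_layout_is_idle_page(struct page *page)
	{
		/* One reference is the DAX entry itself; more means busy. */
		return page_ref_count(page) <= 1;
	}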
> > - /* We're only interested __GFP_FS allocations for now */
> > - if (!(gfp_mask & __GFP_FS))
> > - return false;
> > -
> > if (gfp_mask & __GFP_NOLOCKDEP)
> > return false;
> >
> > @@ -4158,15 +4155,25 @@ void __fs_reclaim_release(void)
> ...work properly for these modified functions.
>
> Miscellanea:
>
> o Remove extra trailing ; and blank line from xfs_agf_verify
>
> Signed-off-by: Joe Perches
> ---
XFS bits look fine.
Acked-by: Dave Chinner
--
Dave Chinner
da...@fromorbit.com
> ...would have been, resulting in tons of dmesg noise like
>
> shrink_slab: i915_gem_inactive_scan+0x0/0x9c negative objects to delete
> nr=-xxxxx
>
> Fix discussed with Dave Chinner.
Acked-by: Dave Chinner
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Thu, Sep 19, 2013 at 08:57:04AM +0200, Daniel Vetter wrote:
> On Wed, Sep 18, 2013 at 10:38 PM, Dave Chinner wrote:
> > No, that's wrong. ->count_objects should never ass SHRINK_STOP.
> > Indeed, it should always return a count of objects in the cache,
[my keyboard may be on the fritz - it's not typing what I'm thinking...]
On Thu, Sep 19, 2013 at 06:38:22AM +1000, Dave Chinner wrote:
> On Wed, Sep 18, 2013 at 12:38:23PM +0200, Knut Petersen wrote:
> > On 18.09.2013 11:10, Daniel Vetter wrote:
> >
> > Just now
...call in a different context.
However, if ->count_objects doesn't return a count, the work that
was supposed to be done cannot be deferred, and that is why
->count_objects should always return the number of objects in the
cache.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
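In other words, using today's API, a well-behaved shrinker splits the
roles like this (illustrative sketch; the demo_* names are made up):

	static unsigned long demo_count(struct shrinker *s,
					struct shrink_control *sc)
	{
		/* Always report the cache size - never SHRINK_STOP here. */
		return atomic_long_read(&demo_nr_objects);
	}

	static unsigned long demo_scan(struct shrinker *s,
				       struct shrink_control *sc)
	{
		/* Only the scan side may refuse to make progress... */
		if (!(sc->gfp_mask & __GFP_FS))
			return SHRINK_STOP;
		/* ...because the count it couldn't act on gets deferred. */
		return demo_evict(sc->nr_to_scan);
	}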
On Tue, Jul 03, 2012 at 11:59:51AM +0100, Mel Gorman wrote:
> On Tue, Jul 03, 2012 at 10:19:28AM +1000, Dave Chinner wrote:
> > On Mon, Jul 02, 2012 at 08:35:16PM +0100, Mel Gorman wrote:
> > > Adding dri-devel and a few others because an i915 patch contributed to
relatively trivial storage setup without the above commit:
[142296.384921] flush-253:4 used greatest stack depth: 360 bytes left
Fundamentally, 8k stacks on x86-64 are too small for our
increasingly complex storage layers and the 100+ function deep call
chains that occur.
Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com
> > ...is full of bugs. Rather than telling people to "update the
> > documentation" because it's too complex, how about we fix the
> > interface and the bugs?
>
> If you can take your time for that, I don't hesitate to help you. ;)
It's on my list of things to do if n...
...view because your shrinker can be called simultaneously on every CPU.
Spinlocks really, really hurt when that happens, and performance
will plummet when it happens because you burn CPU on locks rather
than reclaiming objects. Single threaded object reclaim is still the
fastest way to do reclaim if you have global lists and locks.
What I'm trying to say is that how you solve the shrinker balance
problem for your subsystem will be specific to how you need to hold
pages under memory pressure to maintain performance. Sorry I can't
give you a better answer than that, but that's what my experience
with caches and tuning shrinker behaviour indicates.
Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com
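A common way to act on that advice in a shrinker (illustrative, and
using the current shrinker API rather than the one from this era):
serialise reclaim with a trylock so contending CPUs back off instead
of spinning:

	static unsigned long demo_lru_scan(struct shrinker *s,
					   struct shrink_control *sc)
	{
		unsigned long freed = 0;

		/* Someone else is reclaiming - don't fight for the lock. */
		if (!spin_trylock(&demo_lru_lock))
			return SHRINK_STOP;

		while (freed < sc->nr_to_scan && !list_empty(&demo_lru))
			freed += demo_evict_one();

		spin_unlock(&demo_lru_lock);
		return freed;
	}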
...objects to reclaim them? After all, that's
exactly the interface I'm exposing to filesystems underneath the
shrinker API in the per-sb shrinker patchset that gets rid of
shrink_icache_memory() rather than propagating the insanity.
Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com