On Tue, 27 Oct 2020, Laurent Dufour wrote:
> The issue is that object is not NULL while page is NULL which is odd but
> may happen if the cache flush happened after loading object but before
> loading page. Thus checking for the page pointer is required too.
Ok, then let's revert commit 6159d0f5c
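For readers following along, a minimal sketch of the dual check Laurent describes, with variable and function names assumed from the SLUB fast path rather than taken from the actual patch:

	/*
	 * Hypothetical sketch: a cache flush between the two loads can leave
	 * object non-NULL while page is NULL, so both must be validated
	 * before taking the fast path.
	 */
	if (unlikely(!object || !page))
		object = __slab_alloc(s, gfpflags, node, addr, c);	/* slow path */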
On Wed, 14 Oct 2020, Kees Cook wrote:
> Note on patch 2: Christopher NAKed it, but I actually think this is a
> reasonable thing to add -- the "too small" check is only made when built
> with CONFIG_DEBUG_VM, so it *is* actually possible for someone to trip
> over this directly, even if it would n
On Mon, 12 Oct 2020, Xianting Tian wrote:
> In architecture like powerpc, we can have cpus without any local memory
> attached to it. In such cases the node does not have real memory.
>
> In many places of current kernel code, it doesn't judge whether the node is
> memoryless numa node before call
On Fri, 9 Oct 2020, Kees Cook wrote:
> Store the freelist pointer out of line when object_size is smaller than
> sizeof(void *) and redzoning is enabled.
>
> (Note that no caches with such a size are known to exist in the kernel
> currently.)
Ummm... The smallest allowable cache size is sizeof(vo
On Tue, 6 Oct 2020, Dave Hansen wrote:
> These zero checks are not great because it is not obvious what a zero
> mode *means* in the code. Replace them with a helper which makes it
> more obvious: node_reclaim_enabled().
Well it uselessly checks bits. But whatever. It will prevent future code
re
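A sketch of what such a helper can look like, based only on the description quoted above:

	/*
	 * Sketch: any set bit in node_reclaim_mode means node reclaim is
	 * enabled, so the helper simply tests for a non-zero mask.
	 */
	static inline bool node_reclaim_enabled(void)
	{
		return node_reclaim_mode & (RECLAIM_ZONE | RECLAIM_WRITE | RECLAIM_UNMAP);
	}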
On Tue, 6 Oct 2020, Dave Hansen wrote:
> But, when the bit was removed (bit 0) the _other_ bit locations also
> got changed. That's not OK because the bit values are documented to
> mean one specific thing and users surely rely on them meaning that one
> thing and not changing from kernel to kern
On Tue, 6 Oct 2020, Dave Hansen wrote:
> It is currently not obvious that the RECLAIM_* bits are part of the
> uapi since they are defined in vmscan.c. Move them to a uapi header
> to make it obvious.
Acked-by: Christoph Lameter
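For context, a sketch of the uapi definitions being discussed; the bit values follow the documented meanings and are assumed here, not quoted from the patch:

	/* Bits for /proc/sys/vm/zone_reclaim_mode (values assumed for illustration). */
	#define RECLAIM_ZONE	(1 << 0)	/* Run node reclaim on allocation misses */
	#define RECLAIM_WRITE	(1 << 1)	/* Write out dirty pages during reclaim */
	#define RECLAIM_UNMAP	(1 << 2)	/* Unmap mapped pages during reclaim */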
On Tue, 6 Oct 2020, Matthew Wilcox wrote:
> On Tue, Oct 06, 2020 at 12:56:33AM +0200, Jann Horn wrote:
> > It seems to me like, if you want to make UAF exploitation harder at
> > the heap allocator layer, you could do somewhat more effective things
> > with a probably much smaller performance b
On Mon, 5 Oct 2020, Kees Cook wrote:
> > TYPESAFE_BY_RCU, but if forcing that on by default would enhance security
> > by a measurable amount, it wouldn't be a terribly hard sell ...
>
> Isn't the "easy" version of this already controlled by slab_merge? (i.e.
> do not share same-sized/flagged km
On Tue, 22 Sep 2020, David Brazdil wrote:
> Introduce '.hyp.data..percpu' as part of ongoing effort to make nVHE
> hyp code self-contained and independent of the rest of the kernel.
The percpu subsystem's point is to enable the use of special hardware
instructions that can perform address calculat
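An illustration (not from the patch) of what that hardware assistance buys: a this_cpu_* access folds the per-cpu offset into a single segment-relative instruction on x86, with no explicit smp_processor_id() or offset arithmetic emitted.

	#include <linux/percpu.h>

	DEFINE_PER_CPU(unsigned long, demo_counter);	/* illustrative variable */

	static void demo_bump(void)
	{
		/* Expands to one %gs-relative add on x86. */
		this_cpu_inc(demo_counter);
	}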
On Tue, 15 Sep 2020, Marco Elver wrote:
> void *kmem_cache_alloc(struct kmem_cache *s, gfp_t gfpflags)
> {
> - void *ret = slab_alloc(s, gfpflags, _RET_IP_);
> + void *ret = slab_alloc(s, gfpflags, _RET_IP_, s->object_size);
The additional size parameter is a part of a struct kmem_cache
On Tue, 15 Sep 2020, Marco Elver wrote:
> @@ -3206,7 +3207,7 @@ static void *cache_alloc_node(struct kmem_cache
> *cachep, gfp_t flags,
> }
>
> static __always_inline void *
> -slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
> +slab_alloc_node(struct kmem_cache *cache
On Thu, 13 Aug 2020, wuyun...@huawei.com wrote:
> The two conditions are mutually exclusive and gcc compiler will
> optimise this into if-else-like pattern. Given that the majority
> of free_slowpath is free_frozen, let's provide some hint to the
> compilers.
Acked-by: Christoph Lameter
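A sketch of the kind of hint the quoted patch describes; the slow-path helper name is illustrative, not the real SLUB code:

	/* Sketch only: was_frozen marks the common "free to a frozen (cpu)
	 * slab" case, so it is annotated as the expected branch. */
	if (likely(was_frozen)) {
		stat(s, FREE_FROZEN);		/* fast, no list_lock needed */
	} else {
		put_on_partial_list(s, page);	/* hypothetical rare-case helper */
	}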
On Fri, 7 Aug 2020, Pekka Enberg wrote:
> Why do you consider this to be a fast path? This is all partial list
> accounting when we allocate/deallocate a slab, no? Just like
> ___slab_alloc() says, I assumed this to be the slow path... What am I
> missing?
I thought these were per object counters
On Fri, 7 Aug 2020, Pekka Enberg wrote:
> I think we can just default to the counters. After all, if I
> understood correctly, we're talking about up to 100 ms time period
> with IRQs disabled when count_partial() is called. As this is
> triggerable from user space, that's a performance bug whatev
On Tue, 7 Jul 2020, Pekka Enberg wrote:
> On Fri, Jul 3, 2020 at 12:38 PM xunlei wrote:
> >
> > On 2020/7/2 PM 7:59, Pekka Enberg wrote:
> > > On Thu, Jul 2, 2020 at 11:32 AM Xunlei Pang
> > > wrote:
> > >> The node list_lock in count_partial() spend long time iterating
> > >> in case of large
On Thu, 2 Jul 2020, Xunlei Pang wrote:
> This patch introduces two counters to maintain the actual number
> of partial objects dynamically instead of iterating the partial
> page lists with list_lock held.
>
> New counters of kmem_cache_node are: pfree_objects, ptotal_objects.
> The main operation
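A sketch of what the quoted counters amount to; the field names come from the quote, the containing structure and types are assumed:

	struct partial_counters {		/* assumed representation */
		atomic_long_t pfree_objects;	/* free objects on partial slabs */
		atomic_long_t ptotal_objects;	/* total objects on partial slabs */
	};

	/* With such counters, count_partial() becomes a plain read instead of
	 * a list walk under list_lock. */
	static unsigned long count_partial_free(struct partial_counters *pc)
	{
		return atomic_long_read(&pc->pfree_objects);
	}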
On Mon, 29 Jun 2020, Matthew Wilcox wrote:
> Sounds like we need a test somewhere that checks this behaviour.
>
> > In order to make such allocations possible one would have to create yet
> > another kmalloc array for high memory.
>
> Not for this case because it goes straight to kmalloc_order().
On Mon, 29 Jun 2020, Matthew Wilcox wrote:
> Slab used to disallow GFP_HIGHMEM allocations earlier than this,
It is still not allowed and not supported.
On Sat, 27 Jun 2020, Long Li wrote:
> Environment using the slub allocator, 1G memory in my ARM32.
> kmalloc(1024, GFP_HIGHUSER) can allocate memory normally,
> kmalloc(64*1024, GFP_HIGHUSER) will cause a memory leak, because
> alloc_pages returns highmem physical pages, but it cannot be directly
On Wed, 24 Jun 2020, Srikar Dronamraju wrote:
> Currently Linux kernel with CONFIG_NUMA on a system with multiple
> possible nodes, marks node 0 as online at boot. However in practice,
> there are systems which have node 0 as memoryless and cpuless.
Maybe add something to explain why you are not
On Tue, 16 Jun 2020, William Kucharski wrote:
> Other mm routines such as kfree() and kzfree() silently do the right
> thing if passed a NULL pointer, so ksize() should do the same.
Ok, so the size of a NULL pointer (no object) is zero? Ignoring the freeing
of a nonexistent object makes sense. But deter
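For reference, a sketch of the behaviour under discussion; the real ksize() also contains KASAN handling that is omitted here:

	size_t ksize(const void *objp)
	{
		/* NULL (and the ZERO_SIZE_PTR sentinel) describe no object: size 0. */
		if (unlikely(ZERO_OR_NULL_PTR(objp)))
			return 0;

		return __ksize(objp);
	}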
On Sun, 14 Jun 2020, Muchun Song wrote:
> The slabs_node() always return zero when CONFIG_SLUB_DEBUG is disabled.
> But some codes determine whether slab is empty by checking the return
> value of slabs_node(). As you know, the result is not correct. we move
> the nr_slabs of kmem_cache_node out o
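A sketch of the change being described, assuming nr_slabs is maintained unconditionally rather than only under CONFIG_SLUB_DEBUG:

	static inline unsigned long slabs_node(struct kmem_cache *s, int node)
	{
		struct kmem_cache_node *n = get_node(s, node);

		/* Real per-node slab count instead of the unconditional 0
		 * returned when the field only existed under CONFIG_SLUB_DEBUG. */
		return n ? atomic_long_read(&n->nr_slabs) : 0;
	}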
On Tue, 12 May 2020, Roman Gushchin wrote:
> > Add it to the metadata at the end of the object. Like the debugging
> > information or the pointer for RCU freeing.
>
> Enabling debugging metadata currently disables the cache merging.
> I doubt that it's acceptable to sacrifice the cache merging in
On Tue, 12 May 2020, Srikar Dronamraju wrote:
> +#ifdef CONFIG_NUMA
> + [N_ONLINE] = NODE_MASK_NONE,
Again. Same issue as before. If you do this then you do a global change
for all architectures. You need to put something in the early boot
sequence (in a non architecture specific way) that se
On Mon, 4 May 2020, Roman Gushchin wrote:
> On Sat, May 02, 2020 at 11:54:09PM +0000, Christoph Lameter wrote:
> > On Thu, 30 Apr 2020, Roman Gushchin wrote:
> >
> > > Sorry, but what exactly do you mean?
> >
> > I think the right approach is to add a pointer to each slab object for
> > memcg supp
On Mon, 4 May 2020, Andrew Morton wrote:
> But I guess it's better than nothing at all, unless there are
> alternative ideas?
It is highly unusual to have such large partial lists. In a typical case
allocations would reduce the size of the lists. 1000s? That is scary.
Are there inodes or dentr
On Sun, 3 May 2020, Rafael Aquini wrote:
> On Sat, May 02, 2020 at 11:16:30PM +0000, Christopher Lameter wrote:
> > On Fri, 1 May 2020, Rafael Aquini wrote:
> >
> > > Sometimes it is desirable to override SLUB's debug facilities
> > > default behavior upo
On Thu, 30 Apr 2020, Roman Gushchin wrote:
> Sorry, but what exactly do you mean?
I think the right approach is to add a pointer to each slab object for
memcg support.
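A rough sketch of the suggested layout, with all names illustrative: one extra word of per-object metadata pointing at the owning cgroup, placed past the object like the other debug/RCU metadata.

	/* Illustrative only: s->memcg_offset is a hypothetical field recording
	 * where the per-object cgroup pointer lives within the object's
	 * footprint. */
	static inline struct mem_cgroup **obj_memcg_slot(struct kmem_cache *s, void *obj)
	{
		return (struct mem_cgroup **)((char *)obj + s->memcg_offset);
	}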
On Fri, 1 May 2020, Rafael Aquini wrote:
> Sometimes it is desirable to override SLUB's debug facilities
> default behavior upon stumbling on a cache or object error
> and just stop the execution in order to grab a coredump, at
> the error-spotting time, instead of trying to fix the issue
> and re
On Fri, 1 May 2020, Srikar Dronamraju wrote:
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -116,8 +116,10 @@ EXPORT_SYMBOL(latent_entropy);
> */
> nodemask_t node_states[NR_NODE_STATES] __read_mostly = {
> [N_POSSIBLE] = NODE_MASK_ALL,
> +#ifdef CONFIG_NUMA
> + [N_ONLINE] = NOD
On Fri, 1 May 2020, Srikar Dronamraju wrote:
> - for_each_present_cpu(cpu)
> - numa_setup_cpu(cpu);
> + for_each_possible_cpu(cpu) {
> + /*
> + * Powerpc with CONFIG_NUMA always used to have a node 0,
> + * even if it was memoryless or cpul
On Mon, 27 Apr 2020, Roman Gushchin wrote:
> > Why do you need this? Just slap a pointer to the cgroup as additional
> > metadata onto the slab object. Is that not much simpler, safer and faster?
> >
>
> So, the problem is that not all slab objects are accounted, and sometimes
> we don't know if a
On Mon, 21 Oct 2019, Roman Gushchin wrote:
> So far I haven't noticed any regression on the set of workloads where I did
> test the patchset, but if you know any benchmark or realistic test which can
> be affected by this check, I'll be happy to try.
>
> Also, less-than-word-sized operations ca
On Thu, 17 Oct 2019, Roman Gushchin wrote:
> Currently s8 type is used for per-cpu caching of per-node statistics.
> It works fine because the overfill threshold can't exceed 125.
>
> But if some counters are in bytes (and the next commit in the series
> will convert slab counters to bytes), it's
On Thu, 17 Oct 2019, Roman Gushchin wrote:
> But if some counters are in bytes (and the next commit in the series
> will convert slab counters to bytes), it's not gonna work:
> value in bytes can easily exceed s8 without exceeding the threshold
> converted to bytes. So to avoid overfilling per-cpu
Acked-by: Christoph Lameter
On Sat, 28 Sep 2019, Kaitao Cheng wrote:
> There is no need to make the 'node_order' variable static
> since a new value is always assigned before it is used.
In the past MAX_NUMNODES could become quite large, like 512 or 1k. Large
array allocations on the stack are problematic.
Maybe that is no longe
On Mon, 16 Sep 2019, Pengfei Li wrote:
> The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
> but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
> generated by kmalloc_cache_name().
>
> Patch1 predefines the names of all types of kmalloc to save
> the time spent dynamica
On Mon, 16 Sep 2019, Pengfei Li wrote:
> KMALLOC_NORMAL is the most frequently accessed, and kmalloc_caches[]
> is initialized by different types of the same size.
>
> So modifying kmalloc_caches[type][idx] to kmalloc_caches[idx][type]
> will benefit performance.
Why would that increase performa
On Wed, 11 Sep 2019, Yu Zhao wrote:
> Though I have no idea what the side effect of such race would be,
> apparently we want to prevent the free list from being changed
> while debugging the objects.
process_slab() is called under the list_lock which prevents any allocation
from the free list in
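A simplified sketch of that locking pattern; the real list_locations()/process_slab() code carries extra bookkeeping omitted here:

	static void walk_partial_slabs(struct kmem_cache_node *n,
				       void (*fn)(struct page *page))
	{
		struct page *page;
		unsigned long flags;

		/* Holding list_lock keeps the partial slabs and their free
		 * lists stable for the duration of the walk. */
		spin_lock_irqsave(&n->list_lock, flags);
		list_for_each_entry(page, &n->partial, slab_list)
			fn(page);
		spin_unlock_irqrestore(&n->list_lock, flags);
	}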
On Wed, 4 Sep 2019, Pengfei Li wrote:
> There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
> and KMALLOC_DMA.
I only got a few patches of this set. Can I see the complete patchset
somewhere?
On Sat, 31 Aug 2019, Matthew Wilcox wrote:
> > The current behavior without special alignment for these caches has been
> > in the wild for over a decade. And this is now coming up?
>
> In the wild ... and rarely enabled. When it is enabled, it may or may
> not be noticed as data corruption, or t
On Wed, 17 Jul 2019, Waiman Long wrote:
> The show method of /sys/kernel/slab//shrink sysfs file currently
> returns nothing. This is now modified to show the time of the last
> cache shrink operation in us.
What is this useful for? Any use cases?
> CONFIG_SLUB_DEBUG depends on CONFIG_SYSFS. So
On Wed, 17 Jul 2019, Waiman Long wrote:
> Currently, a value of '1' is written to /sys/kernel/slab//shrink
> file to shrink the slab by flushing out all the per-cpu slabs and free
> slabs in partial lists. This can be useful to squeeze out a bit more memory
> under extreme condition as well as mak
On Mon, 8 Jul 2019, Marco Elver wrote:
> This refactors common code of ksize() between the various allocators
> into slab_common.c: __ksize() is the allocator-specific implementation
> without instrumentation, whereas ksize() includes the required KASAN
> logic.
Acked-by: Christoph Lameter
On Fri, 5 Jul 2019, Markus Elfring wrote:
> Avoid an extra function call by using a ternary operator instead of
> a conditional statement for a string literal selection.
Well. I thought the compiler does that on its own? And the ternary operator
makes the code difficult to read.
On Wed, 3 Jul 2019, Waiman Long wrote:
> On 7/3/19 2:56 AM, Michal Hocko wrote:
> > On Tue 02-07-19 14:37:30, Waiman Long wrote:
> >> Currently, a value of '1' is written to /sys/kernel/slab//shrink
> >> file to shrink the slab by flushing all the per-cpu slabs and free
> >> slabs in partial lists
On Thu, 27 Jun 2019, Roman Gushchin wrote:
> so that objects belonging to different memory cgroups can share the same page
> and kmem_caches.
>
> It's a fairly big change though.
Could this be done at another level? Put a cgroup pointer into the
corresponding structures and then go back to just a
On Mon, 3 Jun 2019, Minchan Kim wrote:
> @@ -415,6 +416,128 @@ static long madvise_cold(struct vm_area_struct *vma,
> return 0;
> }
>
> +static int madvise_pageout_pte_range(pmd_t *pmd, unsigned long addr,
> + unsigned long end, struct mm_walk *walk)
> +{
> +
On Thu, 16 May 2019, Qian Cai wrote:
> It turned out that DEBUG_SLAB_LEAK is still broken even after recent
> rescue efforts that when there is a large number of objects like
> kmemleak_object which is normal on a debug kernel,
Acked-by: Christoph Lameter
On Tue, 14 May 2019, Roman Gushchin wrote:
> To make this possible we need to introduce a new percpu refcounter
> for non-root kmem_caches. The counter is initialized to the percpu
> mode, and is switched to atomic mode after deactivation, so we never
> shutdown an active cache. The counter is bum
On Wed, 8 May 2019, Roman Gushchin wrote:
> Currently the page accounting code is duplicated in SLAB and SLUB
> internals. Let's move it into new (un)charge_slab_page helpers
> in the slab_common.c file. These helpers will be responsible
> for statistics (global and memcg-aware) and memcg charging
On Mon, 29 Apr 2019, Christoph Hellwig wrote:
> So maybe it it time to mark SN2 broken and see if anyone screams?
>
> Without SN2 the whole machvec mess could basically go away - the
> only real difference between the remaining machvecs is which iommu
> if any we set up.
SPARSEMEM with VMEMMAP wa
On Wed, 24 Apr 2019, Matthew Garrett wrote:
> Applications that hold secrets and wish to avoid them leaking can use
> mlock() to prevent the page from being pushed out to swap and
> MADV_DONTDUMP to prevent it from being included in core dumps. Applications
> can also use atexit() handlers to over
On Fri, 19 Apr 2019, Matthew Wilcox wrote:
> ia64 (looks complicated ...)
Well as far as I can tell it was not even used 12 or so years ago on
Itanium when I worked on that stuff.
On Wed, 17 Apr 2019, Roman Gushchin wrote:
> static __always_inline int memcg_charge_slab(struct page *page,
>gfp_t gfp, int order,
>struct kmem_cache *s)
> {
> - if (is_root_cache(s))
> + int idx = (
On Wed, 17 Apr 2019, Roman Gushchin wrote:
> Let's make every page to hold a reference to the kmem_cache (we
> already have a stable pointer), and make kmem_caches to hold a single
> reference to the memory cgroup.
Ok you are freeing one word in the page struct that can be used for other
purposes
Please respond to my comments in the way that everyone else communicates
here. I cannot distinguish what you said from what I said before.
On Tue, 9 Apr 2019, Pankaj Suryawanshi wrote:
> I am confuse about memory configuration and I have below questions
Hmmm... Yes some of the terminology that you use is a bit confusing.
> 1. if 32-bit os maximum virtual address is 4GB, When i have 4 gb of ram
> for 32-bit os, What about the virtu
On Sun, 7 Apr 2019, Linus Torvalds wrote:
> On Sat, Apr 6, 2019 at 12:59 PM Qian Cai wrote:
> >
> > The commit 510ded33e075 ("slab: implement slab_root_caches list")
> > changes the name of the list node within "struct kmem_cache" from
> > "list" to "root_caches_node", but leaks_show() still use
On Wed, 13 Mar 2019, Barret Rhoden wrote:
> > It is very expensive. vSMP exchanges 4K segments via RDMA between servers
> > to build a large address space and run a kernel in the large address
> > space. Using smaller segments can cause a lot of
> > "cacheline" bouncing (meaning transfers of 4K se
On Thu, 4 Apr 2019, Vlastimil Babka wrote:
> Some debugging checks in SLUB are not hidden behind kmem_cache_debug() check.
> Add the check so that those places can also benefit from reduced overhead
> thanks to the static key added by the previous patch.
Hmmm... I would not expect too much of
On Thu, 4 Apr 2019, Vlastimil Babka wrote:
> I looked a bit at SLUB debugging capabilities and first thing I noticed is
> there's no static key guarding the runtime enablement as is common for similar
> debugging functionalities, so here's a RFC to add it. Can be further improved
> if there's inte
On Wed, 3 Apr 2019, Al Viro wrote:
> > This is an RFC and we want to know how to do this right.
>
> If by "how to do it right" you mean "expedite kicking out something with
> non-zero refcount" - there's no way to do that. Nothing even remotely
> sane.
Sure we know that.
> If you mean "kick out
On Wed, 3 Apr 2019, Al Viro wrote:
> Let's do d_invalidate() on random dentries and hope they go away.
> With convoluted and brittle logics for deciding which ones to
> spare, which is actually wrong. This will pick mountpoints
> and tear them out, to start with.
>
> NAKed-by: Al Viro
>
> And th
Acked-by: Christoph Lameter
Acked-by: Christoph Lameter
On Wed, 3 Apr 2019, Tobin C. Harding wrote:
> Add function list_rotate_to_front() to rotate a list until the specified
> item is at the front of the list.
Reviewed-by: Christoph Lameter
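The helper can be expressed with the existing list primitives; a minimal sketch (the actual patch may differ in placement and documentation):

	/*
	 * Rotate @head so that @list becomes the new front: deleting the head
	 * from its ring and re-adding it as the tail before @list achieves the
	 * rotation without touching any other entries.
	 */
	static inline void list_rotate_to_front(struct list_head *list,
						struct list_head *head)
	{
		list_move_tail(head, list);
	}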
On Wed, 3 Apr 2019, Tobin C. Harding wrote:
> Currently we reach inside the list_head. This is a violation of the
> layer of abstraction provided by the list_head. It makes the code
> fragile. More importantly it makes the code wicked hard to understand.
Great! It definitely makes it cleare
On Tue, 26 Mar 2019, Qian Cai wrote:
> + if (!object) {
> + /*
> + * The tracked memory was allocated successful, if the kmemleak
> + * object failed to allocate for some reasons, it ends up with
> + * the whole kmemleak disabled, so let it su
On Mon, 25 Mar 2019, Matthew Wilcox wrote:
> Options:
>
> 1. Dispense with this optimisation and always store the size of the
> object before the object.
I think that's how SLOB handled it at some point in the past. Let's go back
to that setup so it's compatible with the other allocators?
On Fri, 22 Mar 2019, Matthew Wilcox wrote:
> On Fri, Mar 22, 2019 at 07:39:31PM +0000, Christopher Lameter wrote:
> > On Fri, 22 Mar 2019, Waiman Long wrote:
> >
> > > >
> > > >> I am looking forward to it.
> > > > There is also already rcu bei
On Fri, 22 Mar 2019, Waiman Long wrote:
> >
> >> I am looking forward to it.
> > There is also already rcu being used in these paths. kfree_rcu() would not
> > be enough? It is an established mechanism that is mature and well
> > understood.
> >
> In this case, the memory objects are from kmem cache
On Thu, 21 Mar 2019, Waiman Long wrote:
> When releasing kernel data structures, freeing up the memory
> occupied by those objects is usually the last step. To avoid races,
> the release operation is commonly done with a lock held. However, the
> freeing operations do not need to be under lock, bu
On Fri, 22 Mar 2019, Waiman Long wrote:
> I am looking forward to it.
There is also already rcu being used in these paths. kfree_rcu() would not
be enough? It is an established mechanism that is mature and well
understood.
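For reference, a sketch of the established mechanism being referred to; the struct is illustrative, kfree_rcu() is the existing API:

	struct foo {
		struct list_head list;
		struct rcu_head rcu;	/* needed by kfree_rcu() */
	};

	static void release_foo(struct foo *p)
	{
		list_del_rcu(&p->list);
		/* Memory is returned only after all pre-existing RCU readers
		 * have finished, with no explicit callback to write. */
		kfree_rcu(p, rcu);
	}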
On Thu, 21 Mar 2019, Li RongQing wrote:
> nc is a member of percpu allocation memory, and impossible NULL
Acked-by: Christoph Lameter
On Tue, 19 Mar 2019, John Hubbard wrote:
> >
> > My concerns do not affect this patchset which just marks the get/put for
> > the pagecache. The problem was that the description was making claims that
> > were a bit misleading and seemed to prescribe a solution.
> >
> > So lets get this merged. Wh
On Wed, 20 Mar 2019, Dave Chinner wrote:
> So the plan for GUP vs writeback so far is "break fsync()"? :)
Well, if it's an anonymous page and not a file backed page then the
semantics are preserved. Disallow GUP long term pinning (marking stuff like
in this patchset may make that possible) and its
On Fri, 8 Mar 2019, john.hubb...@gmail.com wrote:
> We seem to have pretty solid consensus on the concept and details of the
> put_user_pages() approach. Or at least, if we don't, someone please speak
> up now. Christopher Lameter, especially, since you had some concerns
> recen
On Wed, 13 Mar 2019, Christoph Hellwig wrote:
> On Wed, Mar 13, 2019 at 09:11:13AM +1100, Dave Chinner wrote:
> > On Tue, Mar 12, 2019 at 03:39:33AM -0700, Ira Weiny wrote:
> > > IMHO I don't think that the copy_file_range() is going to carry us
> > > through the
> > > next wave of user performan
On Tue, 12 Mar 2019, Jerome Glisse wrote:
> > > This has been discuss extensively already. GUP usage is now widespread in
> > > multiple drivers, removing that would regress userspace ie break existing
> > > application. We all know what the rules for that is.
You are still misstating the issue.
Acked-by: Christoph Lameter
On Wed, 13 Mar 2019, Tobin C. Harding wrote:
> @@ -297,7 +297,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int
> align, int node)
> continue;
>
> /* Attempt to alloc */
> - prev = sp->lru.prev;
> + prev = sp->slab_list.prev;
>
Acked-by: Christoph Lameter
Acked-by: Christoph Lameter
On Mon, 11 Mar 2019, Dave Chinner wrote:
> > Direct IO on an mmapped file backed page doesn't make any sense.
>
> People have used it for many, many years as zero-copy data movement
> pattern. i.e. mmap the destination file, use direct IO to DMA direct
> into the destination file page cache pages, f
On Fri, 8 Mar 2019, Jerome Glisse wrote:
> >
> > It would good if that understanding would be enforced somehow given the
> > problems
> > that we see.
>
> This has been discuss extensively already. GUP usage is now widespread in
> multiple drivers, removing that would regress userspace ie break e
On Mon, 11 Mar 2019, Roman Gushchin wrote:
> > +static inline void *alloc_scratch(struct kmem_cache *s)
> > +{
> > + unsigned int size = oo_objects(s->max);
> > +
> > + return kmalloc(size * sizeof(void *) +
> > + BITS_TO_LONGS(size) * sizeof(unsigned long),
> > +
On Mon, 11 Mar 2019, Roman Gushchin wrote:
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -4325,6 +4325,34 @@ int __kmem_cache_create(struct kmem_cache *s,
> > slab_flags_t flags)
> > return err;
> > }
> >
> > +void kmem_cache_setup_mobility(struct kmem_cache *s,
> > +
On Fri, 8 Mar 2019, Tycho Andersen wrote:
> On Fri, Mar 08, 2019 at 03:14:13PM +1100, Tobin C. Harding wrote:
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index f9d89c1b5977..754acdb292e4 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -298,6 +298,10 @@ int slab_unm
On Wed, 6 Mar 2019, john.hubb...@gmail.com wrote:
> GUP was first introduced for Direct IO (O_DIRECT), allowing filesystem code
> to get the struct page behind a virtual address and to let storage hardware
> perform a direct copy to or from that page. This is a short-lived access
> pattern, and a
On Wed, 6 Mar 2019, john.hubb...@gmail.com wrote:
> Dave Chinner's description of this is very clear:
>
> "The fundamental issue is that ->page_mkwrite must be called on every
> write access to a clean file backed page, not just the first one.
> How long the GUP reference lasts is irre
On Fri, 1 Mar 2019, Barret Rhoden wrote:
> I'm not familiar with VSMP - how bad is it to use L1 cache alignment instead
> of 4K page alignment? Maybe some structures can use the smaller alignment?
> Or maybe have VSMP require SRCU-using modules to be built-in?
It is very expensive. vSMP exchange
On Thu, 28 Feb 2019, Shaobo He wrote:
> I think maybe the more problematic issue is that the value of a freed pointer
> is indeterminate.
The pointer is not affected by freeing the data it points to. Thus it
definitely has the same value as before and is not indeterminate.
The pointer points now
On Mon, 25 Feb 2019, Dennis Zhou wrote:
> > @@ -27,7 +27,7 @@
> > * chunk size is not aligned. percpu-km code will whine about it.
> > */
> >
> > -#if defined(CONFIG_SMP) && defined(CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK)
> > +#if defined(CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK)
> > #error "con
On Mon, 25 Feb 2019, den...@kernel.org wrote:
> > @@ -67,7 +67,7 @@ static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp)
> > pcpu_set_page_chunk(nth_page(pages, i), chunk);
> >
> > chunk->data = pages;
> > - chunk->base_addr = page_address(pages) - pcpu_group_offsets[0];
> > +
On Fri, 15 Feb 2019, Ira Weiny wrote:
> > > > for filesystems and processes. The only problems come in for the things
> > > > which bypass the page cache like O_DIRECT and DAX.
> > >
> > > It makes a lot of sense since the filesystems play COW etc games with the
> > > pages and RDMA is very much
On Fri, 15 Feb 2019, Matthew Wilcox wrote:
> > Since RDMA is something similar: Can we say that a file that is used for
> > RDMA should not use the page cache?
>
> That makes no sense. The page cache is the standard synchronisation point
> for filesystems and processes. The only problems come in
On Fri, 15 Feb 2019, Dave Chinner wrote:
> Which tells us filesystem people that the applications are doing
> something that _will_ cause data corruption and hence not to spend
> any time triaging data corruption reports because it's not a
> filesystem bug that caused it.
>
> See open(2):
>
>