Re: Designing a safe RX-zero-copy Memory Model for Networking

2016-12-15 Thread Christoph Lameter
On Thu, 15 Dec 2016, Jesper Dangaard Brouer wrote: > > It sounds like Christoph's RDMA approach might be the way to go. > > I'm getting more and more fond of Christoph's RDMA approach. I do > think we will end-up with something close to that approach. I just > wanted to get review on my idea fir

Re: Designing a safe RX-zero-copy Memory Model for Networking

2016-12-14 Thread Christoph Lameter
On Wed, 14 Dec 2016, Hannes Frederic Sowa wrote: > Wouldn't changing of the pages cause expensive TLB flushes? Yes, so you would only want that feature if it's realized at the page table level for debugging issues. Once you have memory registered with the hardware device then also the device could

RE: Designing a safe RX-zero-copy Memory Model for Networking

2016-12-14 Thread Christoph Lameter
On Wed, 14 Dec 2016, David Laight wrote: > If the kernel is doing ANY validation on the frames it must copy the > data to memory the application cannot modify before doing the validation. > Otherwise the application could change the data afterwards. The application is not allowed to change the da

Re: Designing a safe RX-zero-copy Memory Model for Networking

2016-12-14 Thread Christoph Lameter
On Tue, 13 Dec 2016, Hannes Frederic Sowa wrote: > > Interesting. So you even imagine sockets registering memory regions > > with the NIC. If we had a proper NIC HW filter API across the drivers, > > to register the steering rule (like ibv_create_flow), this would be > > doable, but we don't (DP

Re: Designing a safe RX-zero-copy Memory Model for Networking

2016-12-13 Thread Christoph Lameter
On Tue, 13 Dec 2016, Jesper Dangaard Brouer wrote: > This is the early demux problem. With the push-mode of registering > memory, you need hardware steering support, for zero-copy support, as > the software step happens after DMA engine have written into the memory. Right. But we could fall back

Re: Designing a safe RX-zero-copy Memory Model for Networking

2016-12-12 Thread Christoph Lameter
On Mon, 12 Dec 2016, Jesper Dangaard Brouer wrote: > Hmmm. If you can rely on hardware setup to give you steering and > dedicated access to the RX rings. In those cases, I guess, the "push" > model could be a more direct API approach. If the hardware does not support steering then one should be

Re: [RFC 02/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) Bus driver

2016-11-22 Thread Christoph Lameter
On Tue, 22 Nov 2016, Vishwanathapura, Niranjana wrote: > Ok, I do understand Jason's point that we should probably not put this driver > under drivers/infiniband/sw/.., as this driver is not a HCA. > It is an ULP similar to ipoib, built on top of Omni-path irrespective of > whether we register a h

Re: [MM PATCH V4.1 5/6] slub: support for bulk free with SLUB freelists

2015-10-02 Thread Christoph Lameter
On Fri, 2 Oct 2015, Jesper Dangaard Brouer wrote: > Thus, I need to introduce new code like this patch and at the same time > have to reduce the number of instruction-cache misses/usage. In this > case we solve the problem by kmem_cache_free_bulk() not getting called > too often. Thus, +17 bytes w

Re: [MM PATCH V4.1 5/6] slub: support for bulk free with SLUB freelists

2015-09-30 Thread Christoph Lameter
On Wed, 30 Sep 2015, Jesper Dangaard Brouer wrote: > Make it possible to free a freelist with several objects by adjusting > API of slab_free() and __slab_free() to have head, tail and an objects > counter (cnt). Acked-by: Christoph Lameter -- To unsubscribe from this list: send

Re: [PATCH 5/7] slub: support for bulk free with SLUB freelists

2015-09-28 Thread Christoph Lameter
On Mon, 28 Sep 2015, Jesper Dangaard Brouer wrote: > Not knowing SLUB as well as you, it took me several hours to realize > init_object() didn't overwrite the freepointer in the object. Thus, I > think these comments make the reader aware of not-so-obvious > side-effects of SLAB_POISON and SLAB_R

Re: [PATCH 5/7] slub: support for bulk free with SLUB freelists

2015-09-28 Thread Christoph Lameter
On Mon, 28 Sep 2015, Jesper Dangaard Brouer wrote: > > Do you really need separate parameters for freelist_head? If you just want > > to deal with one object pass it as freelist_head and set cnt = 1? > > Yes, I need it. We need to know both the head and tail of the list to > splice it. Ok so thi

Re: [PATCH 5/7] slub: support for bulk free with SLUB freelists

2015-09-28 Thread Christoph Lameter
On Mon, 28 Sep 2015, Jesper Dangaard Brouer wrote: > diff --git a/mm/slub.c b/mm/slub.c > index 1cf98d89546d..13b5f53e4840 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -675,11 +675,18 @@ static void init_object(struct kmem_cache *s, void > *object, u8 val) > { > u8 *p = object; > > +

Re: [PATCH 6/7] slub: optimize bulk slowpath free by detached freelist

2015-09-28 Thread Christoph Lameter
Acked-by: Christoph Lameter

Re: [PATCH 4/7] slab: implement bulking for SLAB allocator

2015-09-28 Thread Christoph Lameter
_disable(); > + for (i = 0; i < size; i++) > + __kmem_cache_free(s, p[i], false); > + local_irq_enable(); > +} > +EXPORT_SYMBOL(kmem_cache_free_bulk); Same concern here. We may just have to accept this for now. Acked-by: Christoph Lameter

Re: [PATCH 3/7] slub: mark the dangling ifdef #else of CONFIG_SLUB_DEBUG

2015-09-28 Thread Christoph Lameter
On Mon, 28 Sep 2015, Jesper Dangaard Brouer wrote: > The #ifdef of CONFIG_SLUB_DEBUG is located very far from > the associated #else. For readability mark it with a comment. Acked-by: Christoph Lameter

Re: Experiences with slub bulk use-case for network stack

2015-09-17 Thread Christoph Lameter
On Thu, 17 Sep 2015, Jesper Dangaard Brouer wrote: > What I'm proposing is keeping interrupts on, and then simply cmpxchg > e.g 2 slab-pages out of the SLUB allocator (which the SLUB code calls > freelist's). The bulk call now owns these freelists, and returns them > to the caller. The API caller

Re: Experiences with slub bulk use-case for network stack

2015-09-16 Thread Christoph Lameter
On Wed, 16 Sep 2015, Jesper Dangaard Brouer wrote: > > Hint, this leads up to discussing if current bulk *ALLOC* API need to > be changed... > > Alex and I have been working hard on practical use-case for SLAB > bulking (mostly slUb), in the network stack. Here is a summary of > what we have lear

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-09 Thread Christoph Lameter
On Wed, 9 Sep 2015, Jesper Dangaard Brouer wrote: > > Hmmm... Guess we need to come up with distinct version of kmalloc() for > > irq and non irq contexts to take advantage of that . Most at non irq > > context anyways. > > I agree, it would be an easy win. Do notice this will have the most > imp

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-08 Thread Christoph Lameter
On Sat, 5 Sep 2015, Jesper Dangaard Brouer wrote: > The double_cmpxchg without lock prefix still cost 9 cycles, which is > very fast but still a cost (add approx 19 cycles for a lock prefix). > > It is slower than local_irq_disable + local_irq_enable that only cost > 7 cycles, which the bulking ca

Re: [PATCH mm] slab: implement bulking for SLAB allocator

2015-09-08 Thread Christoph Lameter
On Tue, 8 Sep 2015, Jesper Dangaard Brouer wrote: > This test was a single CPU benchmark with no congestion or concurrency. > But the code was compiled with CONFIG_NUMA=y. > > I don't know the slAb code very well, but the kmem_cache_node->list_lock > looks like a scalability issue. I guess that i

Re: [PATCH mm] slab: implement bulking for SLAB allocator

2015-09-08 Thread Christoph Lameter
On Tue, 8 Sep 2015, Jesper Dangaard Brouer wrote: > Also notice how well bulking maintains the performance when the bulk > size increases (which is a sore spot for the slub allocator). Well you are not actually completing the free action in SLAB. This is simply queueing the item to be freed later

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-04 Thread Christoph Lameter
On Fri, 4 Sep 2015, Alexander Duyck wrote: > Right, but one of the reasons for Jesper to implement the bulk alloc/free is > to avoid the cmpxchg that is being used to get stuff into or off of the per > cpu lists. There is no full cmpxchg used for the per cpu lists. It's a cmpxchg without lock seman

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-04 Thread Christoph Lameter
On Fri, 4 Sep 2015, Alexander Duyck wrote: > were to create a per-cpu pool for skbs that could be freed and allocated in > NAPI context. So for example we already have napi_alloc_skb, why not just add > a napi_free_skb and then make the array of objects to be freed part of a pool > that could be

Re: [PATCH 7/7] slub: initial bulk free implementation

2015-06-16 Thread Christoph Lameter
On Tue, 16 Jun 2015, Jesper Dangaard Brouer wrote: > It is very important that everybody realizes that the save+restore > variant is very expensive, this is key: > > CPU: i7-4790K CPU @ 4.00GHz > * local_irq_{disable,enable}: 7 cycles(tsc) - 1.821 ns > * local_irq_{save,restore} : 37 cycles(ts

Re: [PATCH 7/7] slub: initial bulk free implementation

2015-06-16 Thread Christoph Lameter
On Tue, 16 Jun 2015, Joonsoo Kim wrote: > So, in your test, most of objects may come from one or two slabs and your > algorithm is well optimized for this case. But, is this workload normal case? It is normal if the objects were bulk allocated because SLUB ensures that all objects are first alloc

Re: [PATCH 7/7] slub: initial bulk free implementation

2015-06-16 Thread Christoph Lameter
On Tue, 16 Jun 2015, Joonsoo Kim wrote: > > If adding these, then I would also need to add those on alloc path... > > Yes, please. Lets fall back to the generic implementation for any of these things. We need to focus on maximum performance in these functions. The more special cases we have to ha

Re: [PATCH 2/7] slub bulk alloc: extract objects from the per cpu slab

2015-06-16 Thread Christoph Lameter
On Tue, 16 Jun 2015, Joonsoo Kim wrote: > Now I found that we need to call slab_pre_alloc_hook() before any operation > on kmem_cache to support kmemcg accounting. And, we need to call > slab_post_alloc_hook() on every allocated objects to support many > debugging features like as kasan and kmemle

Re: [PATCH 1/7] slab: infrastructure for bulk object allocation and freeing

2015-06-15 Thread Christoph Lameter
On Mon, 15 Jun 2015, Alexander Duyck wrote: > So I can see the motivation behind bulk allocation, but I cannot see the > motivation behind bulk freeing. In the case of freeing the likelihood of the > memory regions all belonging to the same page just isn't as high. The likelihood is high if the

Re: [PATCH 6/7] slub: improve bulk alloc strategy

2015-06-15 Thread Christoph Lameter
On Mon, 15 Jun 2015, Jesper Dangaard Brouer wrote: > - break; > + if (unlikely(!object)) { > + c->tid = next_tid(c->tid); tid increment is not needed here since the per cpu information is not modified. > + local_irq_enable()

Re: [PATCH 7/7] slub: initial bulk free implementation

2015-06-15 Thread Christoph Lameter
On Mon, 15 Jun 2015, Jesper Dangaard Brouer wrote: > + for (i = 0; i < size; i++) { > + void *object = p[i]; > + > + if (unlikely(!object)) > + continue; // HOW ABOUT BUG_ON()??? Sure BUG_ON would be fitting here. > + > + page = virt_to

Re: [RFC PATCH] slub: RFC: Improving SLUB performance with 38% on NO-PREEMPT

2015-06-08 Thread Christoph Lameter
On Mon, 8 Jun 2015, Jesper Dangaard Brouer wrote: > My real question is if disabling local interrupts is enough to avoid this? Yes the initial release of slub used interrupt disable in the fast paths. > And, does local irq disabling also stop preemption? Of course.

Re: [RFC PATCH v2 02/11] slab: add private memory allocator header for arch/lib

2015-04-17 Thread Christoph Lameter
On Fri, 17 Apr 2015, Richard Weinberger wrote: > SLUB is the unqueued SLAB and SLLB is the library SLAB. :D Good that this convention is now so broadly known that I did not even have to explain what it meant. But I think you can give it any name you want. SLLB was just a way to tersely state how

Re: [RFC PATCH v2 02/11] slab: add private memory allocator header for arch/lib

2015-04-17 Thread Christoph Lameter
On Fri, 17 Apr 2015, Hajime Tazaki wrote: > add header inclusion for CONFIG_LIB to wrap kmalloc and co. This will > bring malloc(3) based allocator used by arch/lib. Maybe add another allocator instead? SLLB which implements memory management using malloc()?

Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode

2007-12-17 Thread Christoph Lameter
On Sun, 16 Dec 2007, Herbert Xu wrote: > If we can get the address of the per-cpu counter against > some sort of a per-cpu base pointer, e.g., %gs on x86, then > we can do > > incq%gs:(%rax) > > where %rax would be the offset with %gs as the base. This would > obviate the need for the

Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode

2007-12-17 Thread Christoph Lameter
The cpu alloc patches also fix this issue one way (disabling preempt) or the other (atomic instruction that does not need disabling of preemption).

Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench

2007-11-14 Thread Christoph Lameter
On Wed, 14 Nov 2007, David Miller wrote: > > As a result, we may allocate more than a page of data in the > > non-TSO case when exactly one page is desired. Well this is likely the result of the SLUB regression. If you allocate an order 1 page then the zone locks need to be taken. SLAB queues th

Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench

2007-11-14 Thread Christoph Lameter
On Wed, 14 Nov 2007, David Miller wrote: > > Still interested to know why SLAB didn't see the same thing... > > Yes, I wonder why too. I bet objects just got packed differently. The objects are packed tightly in SLUB and SLUB can allocate smaller objects (minimum is 8, SLAB minimum is 32). On

Re: RFC: Reproducible oops with lockdep on count_matching_names()

2007-11-05 Thread Christoph Lameter
On Mon, 5 Nov 2007, Michael Buesch wrote: > Hm, I don't really remember. Though, I usually have almost all kernel-hacking > options enabled. > I'll check and enable some more. slub_debug must be specified on the command line. Alternately switch on CONFIG_SLUB_DEBUG_ON in the .config to force it t

Re: [PATCH] Document non-semantics of atomic_read() and atomic_set()

2007-09-11 Thread Christoph Lameter
Acked-by: Christoph Lameter <[EMAIL PROTECTED]>

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-09-10 Thread Christoph Lameter
On Mon, 10 Sep 2007, Paul E. McKenney wrote: > The one exception to this being the case where process-level code is > communicating to an interrupt handler running on that same CPU -- on > all CPUs that I am aware of, a given CPU always sees its own writes > in order. Yes but that is due to the c

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-09-10 Thread Christoph Lameter
On Mon, 10 Sep 2007, Linus Torvalds wrote: > The fact is, "volatile" *only* makes things worse. It generates worse > code, and never fixes any real bugs. This is a *fact*. Yes, let's just drop the volatiles now! We need a patch that gets rid of them. Volunteers?

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-09-10 Thread Christoph Lameter
On Fri, 17 Aug 2007, Segher Boessenkool wrote: > "volatile" has nothing to do with reordering. atomic_dec() writes > to memory, so it _does_ have "volatile semantics", implicitly, as > long as the compiler cannot optimise the atomic variable away > completely -- any store counts as a side effect.

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-24 Thread Christoph Lameter
On Fri, 24 Aug 2007, Denys Vlasenko wrote: > On Thursday 16 August 2007 00:22, Paul Mackerras wrote: > > Satyam Sharma writes: > > In the kernel we use atomic variables in precisely those situations > > where a variable is potentially accessed concurrently by multiple > > CPUs, and where each CPU

Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()

2007-08-24 Thread Christoph Lameter
On Fri, 24 Aug 2007, Satyam Sharma wrote: > But if people do seem to have a mixed / confused notion of atomicity > and barriers, and if there's consensus, then as I'd said earlier, I > have no issues in going with the consensus (eg. having API variants). > Linus would be more difficult to convince

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Christoph Lameter
On Fri, 17 Aug 2007, Paul E. McKenney wrote: > On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote: > > On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote: > > > > > > gcc bugzilla bug #33102, for whatever that ends up being worth. ;-) > > > > I had totally forgotten that I'

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Christoph Lameter
On Thu, 16 Aug 2007, Chris Snook wrote: > atomic_dec() already has volatile behavior everywhere, so this is semantically > okay, but this code (and any like it) should be calling cpu_relax() each > iteration through the loop, unless there's a compelling reason not to. I'll > allow that for some h

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Christoph Lameter
On Thu, 16 Aug 2007, Paul Mackerras wrote: > The uses of atomic_read where one might want it to allow caching of > the result seem to me to fall into 3 categories: > > 1. Places that are buggy because of a race arising from the way it's >used. > > 2. Places where there is a race but it doesn

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Christoph Lameter
On Thu, 16 Aug 2007, Paul Mackerras wrote: > Herbert Xu writes: > > > It doesn't matter. The memory pressure flag is an *advisory* > > flag. If we get it wrong the worst that'll happen is that we'd > > waste some time doing work that'll be thrown away. > > Ah, so it's the "racy but I don't car

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Christoph Lameter
On Thu, 16 Aug 2007, Paul Mackerras wrote: > > It seems that there could be a lot of places where atomic_t is used in > a non-atomic fashion, and that those uses are either buggy, or there > is some lock held at the time which guarantees that other CPUs aren't > changing the value. In both cases

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Christoph Lameter
On Thu, 16 Aug 2007, Herbert Xu wrote: > > Do we have a consensus here? (hoping against hope, probably :-) > > I can certainly agree with this. I agree too. > But I have to say that I still don't know of a single place > where one would actually use the volatile variant. I suspect that what yo

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2007, Christoph Lameter wrote: > On Thu, 16 Aug 2007, Paul Mackerras wrote: > > > > We don't need to reload sk->sk_prot->memory_allocated here. > > > > Are you sure? How do you know some other CPU hasn't changed the value > > in

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Christoph Lameter
On Thu, 16 Aug 2007, Paul Mackerras wrote: > > We don't need to reload sk->sk_prot->memory_allocated here. > > Are you sure? How do you know some other CPU hasn't changed the value > in between? The cpu knows because the cacheline was not invalidated.

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2007, Paul E. McKenney wrote: > Understood. My point is not that the impact is precisely zero, but > rather that the impact on optimization is much less hurtful than the > problems that could arise otherwise, particularly as compilers become > more aggressive in their optimizations

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2007, Paul E. McKenney wrote: > The volatile cast should not disable all that many optimizations, > for example, it is much less hurtful than barrier(). Furthermore, > the main optimizations disabled (pulling atomic_read() and atomic_set() > out of loops) really do need to be disab

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2007, Paul E. McKenney wrote: > Seems to me that we face greater chance of confusion without the > volatile than with, particularly as compiler optimizations become > more aggressive. Yes, we could simply disable optimization, but > optimization can be quite helpful. A volatile de

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Christoph Lameter
On Thu, 16 Aug 2007, Paul Mackerras wrote: > Those barriers are for when we need ordering between atomic variables > and other memory locations. An atomic variable by itself doesn't and > shouldn't need any barriers for other CPUs to be able to see what's > happening to it. It does not need any

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Christoph Lameter
On Thu, 16 Aug 2007, Paul Mackerras wrote: > In the kernel we use atomic variables in precisely those situations > where a variable is potentially accessed concurrently by multiple > CPUs, and where each CPU needs to see updates done by other CPUs in a > timely fashion. That is what they are for.

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2007, Stefan Richter wrote: > LDD3 says on page 125: "The following operations are defined for the > type [atomic_t] and are guaranteed to be atomic with respect to all > processors of an SMP computer." > > Doesn't "atomic WRT all processors" require volatility? Atomic operations

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-14 Thread Christoph Lameter
On Tue, 14 Aug 2007, Chris Snook wrote: > Because atomic operations are generally used for synchronization, which > requires volatile behavior. Most such codepaths currently use an inefficient > barrier(). Some forget to and we get bugs, because people assume that > atomic_read() actually reads

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-14 Thread Christoph Lameter
On Tue, 14 Aug 2007, Chris Snook wrote: > But barriers force a flush of *everything* in scope, which we generally don't > want. On the other hand, we pretty much always want to flush atomic_* > operations. One way or another, we should be restricting the volatile > behavior to the thing that nee

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-14 Thread Christoph Lameter
On Thu, 9 Aug 2007, Chris Snook wrote: > This patchset makes the behavior of atomic_read uniform by removing the > volatile keyword from all atomic_t and atomic64_t definitions that currently > have it, and instead explicitly casts the variable as volatile in > atomic_read(). This leaves little r

Re: [Bugme-new] [Bug 8778] New: Ocotea board: kernel reports access of bad area during boot with DEBUG_SLAB=y

2007-07-23 Thread Christoph Lameter
On Wed, 18 Jul 2007 09:55:37 -0700 Andrew Morton <[EMAIL PROTECTED]> wrote: > hm. It should be the case that providing SLAB_HWCACHE_ALIGN at > kmem_cache_create() time will override slab-debugging's offsetting > of the returned addresses. That is true for SLUB but not in SLAB. SLAB has always i

Re: [PATCH 5/5] dma: use dev_to_node to get node for device in dma_alloc_pages

2007-07-23 Thread Christoph Lameter
On Tue, 10 Jul 2007 16:53:09 -0700 Yinghai Lu <[EMAIL PROTECTED]> wrote: > [PATCH 5/5] dma: use dev_to_node to get node for device in > dma_alloc_pages Acked-by: Christoph Lameter <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe netdev"

Re: [PATCH 02/40] mm: slab allocation fairness

2007-05-16 Thread Christoph Lameter
On Fri, 4 May 2007, Peter Zijlstra wrote: > Page allocation rank is a scalar quantity connecting ALLOC_ and gfp flags > which > represents how deep we had to reach into our reserves when allocating a page. > Rank 0 is the deepest we can reach (ALLOC_NO_WATERMARK) and 16 is the most > shallow al

Re: select(0, ..) is valid ?

2007-05-15 Thread Christoph Lameter
On Tue, 15 May 2007, Andrew Morton wrote: > Perhaps putting a size=0 detector into slab also would speed this > process up. Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> Index: linux-2.6/mm/slab.c === --- linux

Re: select(0, ..) is valid ?

2007-05-15 Thread Christoph Lameter
On Tue, 15 May 2007, Andrew Morton wrote: > I _think_ we can just do > > --- a/fs/compat.c~a > +++ a/fs/compat.c > @@ -1566,9 +1566,13 @@ int compat_core_sys_select(int n, compat >*/ > ret = -ENOMEM; > size = FDS_BYTES(n); > - bits = kmalloc(6 * size, GFP_KERNEL); > -

Re: [PATCH 08/40] mm: kmem_cache_objsize

2007-05-04 Thread Christoph Lameter
On Fri, 4 May 2007, Pekka Enberg wrote: > Christoph Lameter wrote: > > On Fri, 4 May 2007, Pekka Enberg wrote: > > > > > Again, slab has no way of actually estimating how many pages you need for > > > a > > > given number of objects. So we end up calcul

Re: [PATCH 08/40] mm: kmem_cache_objsize

2007-05-04 Thread Christoph Lameter
On Fri, 4 May 2007, Pekka Enberg wrote: > Christoph Lameter wrote: > > SLAB can calculate exactly how many pages are needed. The per cpu and per > > node stuff is setup at boot and does not change. We are talking about the > > worst case scenario here. True in case of an

Re: [PATCH 08/40] mm: kmem_cache_objsize

2007-05-04 Thread Christoph Lameter
On Fri, 4 May 2007, Pekka Enberg wrote: > Again, slab has no way of actually estimating how many pages you need for a > given number of objects. So we end up calculating some upper bound which > doesn't belong in mm/slab.c. I am perfectly okay with: It can give a worst case number and that is wha

Re: [PATCH 08/40] mm: kmem_cache_objsize

2007-05-04 Thread Christoph Lameter
On Fri, 4 May 2007, Pekka Enberg wrote: > > which would calculate the worst case memory scenario for allocation the > > number of indicated objects? > > IIRC this looks more or less what Peter had initially. I don't like the API > because there's no way for slab (perhaps this is different for slu

Re: [PATCH 08/40] mm: kmem_cache_objsize

2007-05-04 Thread Christoph Lameter
On Fri, 4 May 2007, Peter Zijlstra wrote: > > Ok so you really need the number of objects per page? If you know the > > number of objects then you can calculate the pages needed which would be > > the maximum memory needed? > > Yes, that would work. Hmmm... Maybe lets have unsigned kmem_estim

Re: [PATCH 08/40] mm: kmem_cache_objsize

2007-05-04 Thread Christoph Lameter
On Fri, 4 May 2007, Peter Zijlstra wrote: > > I could add a function that tells you how many object you could allocate > > from a slab without the page allocator becoming involved? It would count > > the object slots available on the partial slabs. > > I need to know how many pages to reserve t

Re: [PATCH 08/40] mm: kmem_cache_objsize

2007-05-04 Thread Christoph Lameter
On Fri, 4 May 2007, Peter Zijlstra wrote: > Expose buffer_size in order to allow fair estimates on the actual space > used/needed. If it's just an estimate that you are after then I think ksize is sufficient. The buffer size does not include the other per slab overhead that SLAB needs nor the

Re: [PATCH 08/40] mm: kmem_cache_objsize

2007-05-04 Thread Christoph Lameter
On Fri, 4 May 2007, Peter Zijlstra wrote: > On Fri, 2007-05-04 at 09:09 -0700, Christoph Lameter wrote: > > On Fri, 4 May 2007, Pekka Enberg wrote: > > > > > On 5/4/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote: > > > > Expose buffer_size in order t

Re: [PATCH 08/40] mm: kmem_cache_objsize

2007-05-04 Thread Christoph Lameter
On Fri, 4 May 2007, Pekka Enberg wrote: > On 5/4/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote: > > Expose buffer_size in order to allow fair estimates on the actual space > > used/needed. We already have ksize?

Re: Possible ways of dealing with OOM conditions.

2007-01-19 Thread Christoph Lameter
On Thu, 18 Jan 2007, Peter Zijlstra wrote: > > > Cache misses for small packet flow due to the fact, that the same data > > is allocated and freed and accessed on different CPUs will become an > > issue soon, not right now, since two-four core CPUs are not yet to be > > very popular and price fo

Re: [RFC][PATCH 1/6] mm: slab allocation fairness

2006-11-30 Thread Christoph Lameter
On Thu, 30 Nov 2006, Peter Zijlstra wrote: > > > Sure, but there is nothing wrong with using a slab page with a lower > > > allocation rank when there is memory aplenty. > > What does "a slab page with a lower allocation rank" mean? Slab pages have > > no allocation ranks that I am aware of. > I

Re: [RFC][PATCH 1/6] mm: slab allocation fairness

2006-11-30 Thread Christoph Lameter
On Thu, 30 Nov 2006, Peter Zijlstra wrote: > Sure, but there is nothing wrong with using a slab page with a lower > allocation rank when there is memory aplenty. What does "a slab page with a lower allocation rank" mean? Slab pages have no allocation ranks that I am aware of.

Re: [RFC][PATCH 1/6] mm: slab allocation fairness

2006-11-30 Thread Christoph Lameter
On Thu, 30 Nov 2006, Peter Zijlstra wrote: > On Thu, 2006-11-30 at 10:52 -0800, Christoph Lameter wrote: > > > I would think that one would need a rank with each cached object and > > free slab in order to do this the right way. > > Allocation hardness is a temporal a

Re: [RFC][PATCH 1/6] mm: slab allocation fairness

2006-11-30 Thread Christoph Lameter
On Thu, 30 Nov 2006, Peter Zijlstra wrote: > No, the forced allocation is to test the allocation hardness at that > point in time. I could not think of another way to test that than to > actually do an allocation. Typically we do this by checking the number of free pages in a zone compared to th

Re: [RFC][PATCH 5/6] slab: kmem_cache_objs_to_pages()

2006-11-30 Thread Christoph Lameter
On Thu, 30 Nov 2006, Peter Zijlstra wrote: > Right, perhaps my bad in wording the intent; the needed information is > how many more pages would I need to grow the slab with in order to store > so many new object. Would you not have to take objects currently available in caches into account? If y

Re: [RFC][PATCH 5/6] slab: kmem_cache_objs_to_pages()

2006-11-30 Thread Christoph Lameter
On Thu, 30 Nov 2006, Peter Zijlstra wrote: > +unsigned int kmem_cache_objs_to_pages(struct kmem_cache *cachep, int nr) > +{ > + return ((nr + cachep->num - 1) / cachep->num) << cachep->gfporder; cachep->num refers to the number of objects in a slab of gfporder. thus return (nr + cachep->num

Re: [RFC][PATCH 1/6] mm: slab allocation fairness

2006-11-30 Thread Christoph Lameter
On Thu, 30 Nov 2006, Peter Zijlstra wrote: > The slab has some unfairness wrt gfp flags; when the slab is grown the gfp > flags are used to allocate more memory, however when there is slab space > available, gfp flags are ignored. Thus it is possible for less critical > slab allocations to succ

Re: [PATCH 0/4] VM deadlock prevention -v5

2006-08-25 Thread Christoph Lameter
On Fri, 25 Aug 2006, Peter Zijlstra wrote: > The basic premises is that network sockets serving the VM need undisturbed > functionality in the face of severe memory shortage. > > This patch-set provides the framework to provide this. Hmmm.. Is it not possible to avoid the memory pools by guaran

Re: [PATCH 1/1] network memory allocator.

2006-08-18 Thread Christoph Lameter
On Fri, 18 Aug 2006, Andi Kleen wrote: > Also I must say it's still not quite clear to me if it's better to place > network packets on the node the device is connected to or on the > node which contains the CPU who processes the packet data > For RX this can be three different nodes in the worst

Re: [PATCH 1/1] network memory allocator.

2006-08-17 Thread Christoph Lameter
On Wed, 16 Aug 2006, Andi Kleen wrote: > That's not true on all NUMA systems (that they have a slow interconnect) > I think on x86-64 I would prefer if it was distributed evenly or maybe even > on the CPU who is finally going to process it. > > -Andi "not all NUMA is an Altix" The Altix NUMA in

LRO Patent vs. patent free TOE

2005-08-22 Thread Christoph Lameter
On Sun, 21 Aug 2005, Leonid Grossman wrote: > Ahh, I was curious to see if someone will bring this argument up - in > fact, LRO legal issues do not exist, while TOE legal issues are quite > big at the moment. I guess this is one of the reasons why OpenRDMA and > other mainstream industry efforts d

RE: LRO Patent vs. patent free TOE

2005-08-22 Thread Christoph Lameter
On Mon, 22 Aug 2005, Leonid Grossman wrote: > With several tens of already granted and very broad TOE-related patents, > this statement sounds rather naïve, and I just wish anyone good luck > defending it in the future... Ummm. TOE has been around for 20 years now and there is lots of prior art.

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-21 Thread Christoph Lameter
On Sun, 21 Aug 2005, David S. Miller wrote: > LRO will work, and it's the negative attitude of the TOE folks that > inspires me to want to help out the LRO folks and ignore the TOE mania > altogether. Dave you criticized the black and white attitude before. It seems that you are the only one in th

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-20 Thread Christoph Lameter
On Sat, 20 Aug 2005, David S. Miller wrote: > From: Christoph Lameter <[EMAIL PROTECTED]> > Date: Sat, 20 Aug 2005 21:16:16 -0700 (PDT) > > It does not exist today AFAIK. The hope of such a solution will prevent > > the inclusion of TOE technology that exists today? >

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-20 Thread Christoph Lameter
On Sat, 20 Aug 2005, David S. Miller wrote: > But by and large, if a stateless alternative ever exists to > get the same performance benefit as TOE, it will undoubtedly > be preferred by the Linux networking maintainers, by and large. > So you TOE guys are fighting more than an uphill battle. It do

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-20 Thread Christoph Lameter
On Sat, 20 Aug 2005, David S. Miller wrote: > Christoph, you're a really bright guy, perhaps you can sit and come up > with some other ideas which would act as stateless alternatives to > TOE? I bet you can do it, if you would simply try... I worked through the alternatives last year including s

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-20 Thread Christoph Lameter
On Fri, 19 Aug 2005, Andi Kleen wrote: > Hmm - but is a 9k or 16k packet on the wire not equivalent to a micro burst? > (actually it is not that micro compared to 1.5k packets). At least against > burstiness they don't help and make things even worse because the bursts > cannot be split up anymor

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-18 Thread Christoph Lameter
On Thu, 18 Aug 2005, David S. Miller wrote: > The same performance can be obtained with stateless offloads. > You continually ignore this possibility, as if TOE is the only > way. TCP is a stateful protocol and what can be done with stateless offloads is very limited.

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-18 Thread Christoph Lameter
On Thu, 18 Aug 2005, David S. Miller wrote: > Wouldn't you rather have a commoditized $40.00USD gigabit network card > that got TOE level performance? I guess that question's answer depends > upon whether you have some financial stake in a company doing TOE :-) We may have TOE in $40 network car

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-18 Thread Christoph Lameter
On Thu, 18 Aug 2005, David S. Miller wrote: > And once again it will be "niche", and very far from commodity. > A specialized optimization for a very small and specialized audience, > ie. not appropriate for Linux upstream. The TOE method will gradually become standard simply because it allows p

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-18 Thread Christoph Lameter
On Thu, 18 Aug 2005, David S. Miller wrote: > This is what has always happened in the past, people were preaching > for TOE back when 100Mbit ethernet was "new and fast". But you > certainly don't see anyone trying to justify TOE for those link > speeds today. The same will happen for 1Gbit and

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-18 Thread Christoph Lameter
On Thu, 18 Aug 2005, David S. Miller wrote: > The point remains that TOE creates an ENORMOUS support burden > upon us, and makes bugs harder to field even if we add the > "TOE Taint" thing. Simply switch off the TOE to see if it's TOE or the OS stack. TCP is fairly standard though this is a pretty