Hi,

[Added jhb@ to the CC list]

On Mon, Oct 3, 2011 at 1:34 PM, <[email protected]> wrote:
> On Mon, Oct 3, 2011 at 10:24 AM, Arnaud Lacombe <[email protected]> wrote:
>> Hi,
>>
>> On Mon, Oct 3, 2011 at 12:31 PM, <[email protected]> wrote:
>>> On Mon, Oct 3, 2011 at 7:55 AM, satish kondapalli <[email protected]>
>>> wrote:
>>>> I am new to FreeBSD, I just want to know whether FreeBSD supports NUMA.
>>>> If FreeBSD supports NUMA, what are the kernel APIs to allocate memory?
>>>> Is there any example driver or any driver which is using the NUMA API?
>>>>
>>>> please provide some inputs...
>>>
>>> The kernel is NUMA-aware (at least for x86),
>>>
>> What "x86" ? i386 ? amd64 ? both ?
>
> Both; see sys/x86/acpica/srat.c which parses the SRAT table.
>
>>> and memory is allocated
>>> round-robin amongst the memory domains.  There are not yet any KPIs
>>> for allocating memory in a specific NUMA domain, nor for binding
>>> specific threads / processes to get their memory local to a bound cpu
>>> instead of round robin.
>>>
>> I'm not sure I follow you. Say you have 2 memory domains attached to 2
>> different CPU packages, each providing a memory domain, 4 physical cores
>> and possibly 8 virtual. Say you have a network adapter supporting 8
>> RX/TX queues, dispatching RX packets to 8 netisr. Ideally, you'd want
>> those 8 queues/netisr to each have an affinity for a given CPU/memory
>> domain, and have the network adapter route flows evenly on those 8
>> CPUs. Now, if you allocate an mbuf from memory domain 1, and it ends up
>> being processed by a CPU in domain 0, that is likely to introduce a
>> performance penalty.
>
> Your statement isn't incorrect.  What I'm saying is that there's no
> KPI for requesting bound memory because, while the netstat example is
> a fine one for where local memory is desired, the majority [1] of
> processing is not bound to a CPU and so round-robin allocations will
> produce uniform performance results -- that is, not the best possible,
> but not wildly fluctuating as scheduling decisions over different runs
> give different remote memory penalties.
>
> [1] for some definition of 'majority'.
>
>> Now, what about userland ?
>>
>> This is certainly a horribly big picture :/
>
> Yes, and it's why I said just that there's no KPI.  One reason there
> is no KPI is that there's a lot of fiddly bits to take into account.
>
> My experience at IBM on AIX was that NUMA is very easy to get wrong;
> specifically what one usually wants is for the OS to get the answer
> right (especially for userspace) without a lot of manual tuning;
> except for some specific applications like netstat queues or a machine
> doing HPC or mostly running e.g. an Oracle db server, there's too much
> happening for any one program to configure itself "right" for all the
> uses of that code.  I remember a lot of customer reports of problems
> from overly aggressive local memory use.  Most of the time no one
> complained when things had consistent performance, even if that wasn't
> quite as fast as possible.
>
Is there any project in progress to get this addressed ? In the past
year, I can only see 3 commits related to NUMA, one of them concerning
only ia64.

Btw, I'd be interested to see how FreeBSD 9.0 and a recent Linux
kernel behave on +2 CPU package machines.

 - Arnaud

[0]: http://lwn.net/Articles/254445/

> In fact, I may be wrong about the round-robin; I sent jhb@ a patch and
> I have no recollection anymore whether it's actually in CURRENT.  It's
> been over a year since I thought about this much (BSDCan 2010 was the
> last time I remember).
>
> Cheers,
> matthew

