Re: mbuf external buffer reference counters
On Thu, Jul 11, 2002 at 11:41:04PM -0700, Alfred Perlstein wrote:
> > That's a cool idea.. haven't looked at NetBSD but am imagining the
> > mbufs would be linked in a 'ring'. This works because you never
> > care how many references there are, just whether there's one or more
> > than one, and this is easy to tell by examining the ring pointer.
> > I.e., you never have to iterate through the entire ring.
>
> That's true, but could someone explain how one can safely and
> efficiently manipulate such a structure in an SMP environment?
>
> I'm not saying it's impossible, I'm just saying it didn't seem
> intuitive to me back then, as well as now.

I'm probably speaking out of turn here (I have no idea what structure you
all are talking about), but a monodirectional ring can be safely modified
with a compare-and-exchange atomic operation.

--
Jonathan Mini <[EMAIL PROTECTED]>
http://www.freebsd.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-net" in the body of the message
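A minimal sketch of the compare-and-exchange idea for a singly linked
ring, written with C11 atomics rather than the kernel's own atomic
primitives. The type and function names (ring_node, ring_insert_after,
ring_is_shared, ring_count) are hypothetical and are not taken from any
mbuf code. Only insertion is shown; lock-free removal is considerably
harder and is not attempted here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical ring node; NOT the real struct mbuf. */
struct ring_node {
	_Atomic(struct ring_node *) next;
};

/* A ring of one: the node points at itself. */
static void
ring_init(struct ring_node *n)
{
	atomic_init(&n->next, n);
}

/*
 * Insert 'n' right after 'pos'.  If another CPU changed pos->next
 * between our load and the compare-and-exchange, the exchange fails,
 * 'succ' is refreshed with the current value, and we retry.
 */
static void
ring_insert_after(struct ring_node *pos, struct ring_node *n)
{
	struct ring_node *succ = atomic_load(&pos->next);

	do {
		atomic_store(&n->next, succ);
	} while (!atomic_compare_exchange_weak(&pos->next, &succ, n));
}

/* "One, or more than one?" needs a single pointer comparison. */
static int
ring_is_shared(struct ring_node *n)
{
	return (atomic_load(&n->next) != n);
}

/* Walk the ring once and count the nodes (demonstration only). */
static size_t
ring_count(struct ring_node *start)
{
	size_t count = 1;
	struct ring_node *p;

	for (p = atomic_load(&start->next); p != start;
	    p = atomic_load(&p->next))
		count++;
	return (count);
}
```

The sketch only has a single pointer per node to update, which is why
one compare-and-exchange suffices; it says nothing about rings that
also carry a back pointer.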
Re: mbuf external buffer reference counters
On Fri, Jul 12, 2002 at 12:10:41AM -0700, Alfred Perlstein wrote:
> * Julian Elischer <[EMAIL PROTECTED]> [020712 00:00] wrote:
> >
> >
> > On Thu, 11 Jul 2002, Alfred Perlstein wrote:
> >
> > > That's true, but could someone explain how one can safely and
> > > efficiently manipulate such a structure in an SMP environment?
> >
> > what does NetBSD do for that?
>
> They don't!
>
> *** waves skull staff exasperatedly ***
>
> RORWLRLRLLRL

Again, Alfred is right. :-)

I can't think of a way to ensure that the owner of the other mbuf
doesn't manipulate its two forward/backward pointers while we're
manipulating ours. The only way that springs to mind is to have
them protected by a mutex, but:

 1) that would be very expensive and would bloat the mbuf structure
    a LOT;
 2) we would probably run into lock order reversal problems.

I see now what Alfred meant when he made his original comment.

--
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]
Re: mbuf external buffer reference counters
On Fri, Jul 12, 2002 at 07:45:07AM -0400, Bosko Milekic wrote:
>
> [ ... Description of modifying a bidirectional ring ... ]
>
> So I guess that what we're dealing with isn't really a
> "monodirectional" ring. Right?

Yep. =)

--
Jonathan Mini <[EMAIL PROTECTED]>
http://www.freebsd.org/
Re: mbuf external buffer reference counters
On 2002-07-11 17:12 +, Bosko Milekic wrote:
> On Thu, Jul 11, 2002 at 01:56:08PM -0700, Luigi Rizzo wrote:
> > example: userland does an 8KB write, in the old case this requires
> > 4 clusters, with the new one you end up using 4 clusters and stuff
> > the remaining 16 bytes in a regular mbuf, then depending on the
> > relative producer-consumer speed the next write will try to fill
> > the mbuf and attach a new cluster, and so on... and when TCP hits
> > these data-in-mbuf blocks will have to copy rather than reference
> > the data blocks...
>
> This is a good observation if we're going to be doing benchmarking,
> but I'm not sure whether the repercussions are that important (unless,
> as I said, there's a lot of applications that send exactly 8192
> byte chunks?).

This is true not only for 8192-byte writes. Anything that uses a block
size of 2048 bytes or more, at or near a power of 2, will have the same
problem. Writes of 2048, 4096, 8192, 16384, ... bytes will all hit it :/
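The leftover bytes discussed above can be worked out directly. A small
sketch in C, assuming a 4-byte in-cluster counter (the "usually 4-byte"
figure mentioned elsewhere in this thread) so that each cluster holds
2044 bytes of data. The names split_write, CLUSTER_SIZE and REFCNT_SIZE
are made up for illustration and are not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

#define CLUSTER_SIZE	2048u	/* cluster size discussed in the thread */
#define REFCNT_SIZE	4u	/* ASSUMED size of an in-cluster counter */

struct split {
	size_t full_clusters;	/* clusters filled completely */
	size_t leftover;	/* bytes that spill into one more buffer */
};

/* Split a write of 'len' bytes into buffers holding 'usable' bytes each. */
static struct split
split_write(size_t len, size_t usable)
{
	struct split s;

	assert(usable > 0);
	s.full_clusters = len / usable;
	s.leftover = len % usable;
	return (s);
}
```

With usable = CLUSTER_SIZE, an 8192-byte write fills 4 clusters exactly.
With usable = CLUSTER_SIZE - REFCNT_SIZE, the same write fills 4 clusters
and leaves 16 bytes over, which matches the figure in the quoted example;
2048-, 4096- and 16384-byte writes leave 4, 8 and 32 bytes respectively.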
Re: mbuf external buffer reference counters
On Fri, Jul 12, 2002 at 04:26:53AM -0700, Jon Mini wrote:
> On Thu, Jul 11, 2002 at 11:41:04PM -0700, Alfred Perlstein wrote:
> > > That's a cool idea.. haven't looked at NetBSD but am imagining the
> > > mbufs would be linked in a 'ring'. This works because you never
> > > care how many references are, just whether there's one or more than
> > > one, and this is easy to tell by examining the ring pointer.
> > > I.e., you never have to iterate through the entire ring.
> >
> > That's true, but could someone explain how one can safely and
> > efficiently manipulate such a structure in an SMP environment?
> >
> > I'm not saying it's impossible, I'm just saying it didn't seem
> > intuitive to me back then, as well as now.
>
> I'm probably speaking out of turn here (I have no idea what structure you
> all are talking about), but a monodirectional ring can be safely modified
> with a compare-and-exchange atomic operation.

The gist of the problem is that when you want to, say, remove yourself
from the list, you have to:

 1) set your "next"'s back pointer to your "back" pointer
 2) set your "back"'s next pointer to your "next" pointer

So that's two operations, but for all you know your "next" or your
"back" may be doing the same thing to you at the same time. As far as
I know, you can't (intuitively) figure out a way to do both of these
atomically. I.e., maybe you'll set your next's back pointer to whatever
you have in `back', but then your `back' guy will set your back pointer
to whatever he has in `back', and then your next guy's back pointer will
be invalid, for example.

So I guess that what we're dealing with isn't really a
"monodirectional" ring. Right?

> --
> Jonathan Mini <[EMAIL PROTECTED]>
> http://www.freebsd.org/

Regards,
--
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]
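The two-step removal described above, sketched in plain single-threaded
C to show where the window is. The struct and function names
(dring_node, dring_remove and so on) are hypothetical; the field names
follow the wording of the message, not struct mbuf. The code is correct
only when nothing else touches the ring concurrently, which is exactly
the limitation being discussed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical doubly linked ring node; NOT the real struct mbuf. */
struct dring_node {
	struct dring_node *next;
	struct dring_node *back;
};

/* A ring of one: both pointers refer to the node itself. */
static void
dring_init(struct dring_node *n)
{
	n->next = n;
	n->back = n;
}

/* Link 'n' into the ring directly after 'pos'. */
static void
dring_insert_after(struct dring_node *pos, struct dring_node *n)
{
	n->next = pos->next;
	n->back = pos;
	pos->next->back = n;
	pos->next = n;
}

/* Unlink 'n'.  Two separate stores to two different neighbours. */
static void
dring_remove(struct dring_node *n)
{
	struct dring_node *next = n->next;
	struct dring_node *back = n->back;

	next->back = back;	/* step 1 */
	/*
	 * Window: if 'back' or 'next' removes itself on another CPU
	 * right here, it reads or overwrites links that are only half
	 * updated, and the ring can end up with a stale pointer.
	 */
	back->next = next;	/* step 2 */
	n->next = n;
	n->back = n;
}
```

A single compare-and-exchange covers one word, so it can make step 1 or
step 2 atomic but not the pair.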
Re: mbuf external buffer reference counters
On 2002-07-12 07:45 +, Bosko Milekic wrote:
> The gist of the problem is that when you want to, say, remove yourself
> from the list, you have to:
>
> 1) set your "next"'s back pointer to your "back" pointer
> 2) set your "back"'s next pointer to your "next" pointer
>
> So that's two operations but for all you know your "next" or your
> "back" may be doing the same thing to you at the same time. As far as
> I know, you can't (intuitively) figure out a way to do both of these
> atomically. i.e., maybe you'll set your next's back pointer to whatever
> you have in `back' but then your `back' guy will set your back pointer
> to whatever he has in `back' and then your next guy's back pointer will
> be invalid, for example.
>
> So I guess that what we're dealing with isn't really a
> "monodirectional" ring. Right?

No, it isn't. It looks more like the "dining philosophers" problem.
But that problem's solution would require at least one mutex for every
part of the ring :-(
RE: xl checksum and dsniff
> -----Original Message-----
> From: Jonathan Lemon [mailto:[EMAIL PROTECTED]]
>
> >My guess is that doing hw checksum by the nic could be the issue.
> >This is the only real difference I can see at present.
> >
> >Any ideas?
>
> Test your theory. Turn off hardware checksums with
> 'ifconfig xl0 -txcsum'

When I do 'ifconfig xl0 -txcsum', a subsequent 'ifconfig' reads as if the
command had no effect. In other words, ifconfig shows options=3 still.

Using tcpdump, it also still reports 'bad checksum' even though everything
works fine.

The man page for xl also doesn't show these commands. Perhaps they are
not turned on yet?

On a similar machine, running OpenBSD 3.0, dsniff works just fine. This
machine doesn't have support for checksum offload (or at least,
ifconfig xl0 doesn't indicate it.)

MikeC
Question about network layers in FreeBSD 4.x
I have a system I run FreeBSD 4.5-release on. The purpose of this system
is to run Snort (IDS).

The current system is a Compaq Proliant 1850R; I have also tried a
Compaq Proliant 1600R.

Both systems are SMP with dual processors, more than 256 MB of RAM, and
a Compaq Smart Array controller to handle RAID in hardware.

I want to use this box to monitor multiple LAN segments. So I use the
builtin tlan eth for mgmt, and then add other NICs with no IP addresses
for snort to listen on.

This works great when I use multiple distinct NIC cards: 3com + Intel +
Realtek.

However, when I try to use a quad ethernet card, it fails. The programs
don't bomb and no errors are reported, but some activity doesn't get
picked up when using the quad cards that does get picked up in the
multiple-NIC scenario.

For example, if someone in LAN segment x.x.a.x connects to a *nix server
in x.x.b.x (both monitored by this box), and a suspicious event occurs,
I will see it captured by both of the snort interfaces. If, however, I
put in the quad card, and the same thing happens, it will only be
seen/recorded by one of the snort NIC instances.

I have tried this with a Znyx ZX346Q and with an Adaptec quad card. With
the Znyx I tried both the default FreeBSD driver that it detects the card
as and the Znyx drivers. This seems to be a problem somewhere other than
in the NIC driver itself.

Any suggestions or insight into what might be wrong here would be
greatly appreciated.
Re: xl checksum and dsniff
On Fri, Jul 12, 2002 at 09:06:13AM -0400, Cambria, Mike wrote:
> > -----Original Message-----
> > From: Jonathan Lemon [mailto:[EMAIL PROTECTED]]
> >
> > >My guess is that doing hw checksum by the nic could be the issue.
> > >This is the only real difference I can see at present.
> > >
> > >Any ideas?
> >
> > Test your theory. Turn off hardware checksums with
> > 'ifconfig xl0 -txcsum'
>
> When I do 'ifconfig xl0 -txcsum', a subsequent 'ifconfig' reads as if the
> command had no effect. In other words, ifconfig shows
> options=3 still.

Oh, hmm. It appears that this driver doesn't support disabling checksums.
For the time being, you can recompile the driver, and manually disable the
checksums by editing the define at the top of the file:

#define XL905B_CSUM_FEATURES 0
--
Jonathan
Re: xl checksum and dsniff
Actually, I seem to remember that the ifconfig output only shows
the driver's capabilities, not the actual setting.

	cheers
	luigi

On Fri, Jul 12, 2002 at 12:00:48PM -0500, Jonathan Lemon wrote:
>
> On Fri, Jul 12, 2002 at 09:06:13AM -0400, Cambria, Mike wrote:
> > > -----Original Message-----
> > > From: Jonathan Lemon [mailto:[EMAIL PROTECTED]]
> > >
> > > >My guess is that doing hw checksum by the nic could be the issue.
> > > >This is the only real difference I can see at present.
> > > >
> > > >Any ideas?
> > >
> > > Test your theory. Turn off hardware checksums with
> > > 'ifconfig xl0 -txcsum'
> >
> > When I do 'ifconfig xl0 -txcsum', a subsequent 'ifconfig' reads as if the
> > command had no effect. In other words, ifconfig shows
> > options=3 still.
>
> Oh, hmm. It appears that this driver doesn't support disabling checksums.
> For the time being, you can recompile the driver, and manually disable the
> checksums by editing the define at the top of the file:
>
> #define XL905B_CSUM_FEATURES 0
> --
> Jonathan
Re: xl checksum and dsniff
No - ifconfig shows the actual settings. 'ifconfig -m' will show
both the configured settings and the driver capability list.
--
Jonathan

On Fri, Jul 12, 2002 at 10:43:24AM -0700, Luigi Rizzo wrote:
> Actually, I seem to remember that the ifconfig output only shows
> the driver's capabilities, not the actual setting.
>
> cheers
> luigi
>
> On Fri, Jul 12, 2002 at 12:00:48PM -0500, Jonathan Lemon wrote:
> >
> > On Fri, Jul 12, 2002 at 09:06:13AM -0400, Cambria, Mike wrote:
> > > > -----Original Message-----
> > > > From: Jonathan Lemon [mailto:[EMAIL PROTECTED]]
> > > >
> > > > >My guess is that doing hw checksum by the nic could be the issue.
> > > > >This is the only real difference I can see at present.
> > > > >
> > > > >Any ideas?
> > > >
> > > > Test your theory. Turn off hardware checksums with
> > > > 'ifconfig xl0 -txcsum'
> > >
> > > When I do 'ifconfig xl0 -txcsum', a subsequent 'ifconfig' reads as if the
> > > command had no effect. In other words, ifconfig shows
> > > options=3 still.
> >
> > Oh, hmm. It appears that this driver doesn't support disabling checksums.
> > For the time being, you can recompile the driver, and manually disable the
> > checksums by editing the define at the top of the file:
> >
> > #define XL905B_CSUM_FEATURES 0
> > --
> > Jonathan
Re: Question about network layers in FreeBSD 4.x
freebsd wrote:
>
> I have a system I run FreeBSD 4.5-release on. The purpose of this system is
> to run Snort (IDS).
>
> The current system is a Compaq Proliant 1850R, have also tried on a Compaq
> Proliant 1600R.
>
> Both systems are SMP with dual processors, > 256m ram, and Compaq Smart Array
> controller to handle raid in hardware.
>

FreeBSD 4.x (did you notice 4.6 has been released?) is not very good at
using SMP machines where there are lots of interrupts (the kernel can
only be run by one CPU at any one time, and this is enforced by a
"Big Giant Lock").

You should re-run your test without the SMP option, to see if the
problem is still there (it should not be).

Then, there are kernel options in recent versions of FreeBSD enabling
an optimized use of the interrupts (DEVICE POLLING). This may help you,
if the driver has been modified.

I used a cheap 4-port NIC from DLINK (DFE-570-TX) with very good success
(this is the dc driver).

Hope this helps

TfH
Re: mbuf external buffer reference counters
In article <[EMAIL PROTECTED]>,
Bosko Milekic <[EMAIL PROTECTED]> wrote:
>
> Right now, in -CURRENT, there is this hack that I introduced that
> basically just allocates a ref. counter for external buffers attached
> to mbufs with malloc(9). What this means is that if you do something
> like allocate an mbuf and then a cluster, there's a malloc() call that
> is made to allocate a small (usually 4-byte) reference counter for it.
>
> That sucks,

Eeek, it sure does!

> and even -STABLE doesn't do this. I changed it this way
> a long time ago for simplicity's sake and since then I've been meaning
> to do something better here. The idea was, for mbuf CLUSTERS, to
> stash the counter at the end of the 2K buffer area, and to make
> MCLBYTES = 2048 - sizeof(refcount), which should be more than enough,
> theoretically, for all cluster users. This is by far the easiest
> solution (I had it implemented about 10 months ago) and it worked
> great.
>
> The purpose of this Email is to find out if anyone has concrete
> information on why this wouldn't work (if they think it wouldn't).

I've been out of town and I realize I'm coming into this thread late
and that it has evolved a bit. But I still think it's worthwhile to
point out a very big problem with the idea of putting the reference
count at the end of each mbuf cluster. It would have disastrous
consequences for performance because of cache effects. Bear with me
through a little bit of arithmetic.

Consider a typical PIII CPU that has a 256 kbyte 4-way set-associative
L2 cache with 32-byte cache lines. 4-way means that there are 4
different cache lines associated with each address. Each group of 4
is called a set, and each set covers 32 bytes of the address space
(the cache line size).

The total number of sets is:

    256 kbytes / 32 bytes per line / 4 lines per set = 2048 sets

and as mentioned above, each set covers 32 bytes.

The cache wraps around every 256 kbytes / 4-way = 64 kbytes of address
space. In other words, if address N maps onto a given set, then
addresses N + 64k, N + 128k, etc. all map onto the same set.

An mbuf cluster is 2 kbytes and all mbuf clusters are well-aligned.
So the wrap around of the cache occurs every 64 kbytes / 2 kbytes per
cluster = 32 clusters. To put it another way, all of the reference
counts would be sharing (i.e., competing for) the same 32 cache sets
and they would never utilize the remaining 2016 sets at all. Only
1.56% of the cache (32 sets / 2048 sets) would be usable for the
reference counts. This means there would be a lot of cache misses as
reference count updates caused other reference counts to be flushed
from the cache.

These cache effects are huge, and they are growing all the time as CPU
speeds increase while RAM speeds remain relatively constant.

It is much better to have the reference counts laid out as they are in
-stable, i.e., one big contiguous block of counts. That way, the
counts are spread out through the entire cache and they don't compete
with each other nearly so much. That is the underlying principle of
slab allocators, by the way.

John
--
  John Polstra
  John D. Polstra & Co., Inc.                  Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."
                                                -- Chögyam Trungpa
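The set arithmetic above can be checked mechanically. A small sketch in
C under the same parameters (256 kbyte, 4-way, 32-byte lines, 2 kbyte
clusters), with two simplifying assumptions that are not in the message:
the cluster area starts at address 0, and the cache is indexed directly
by those addresses. The function names cache_set and distinct_sets, and
the 4-byte counter size, are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE	32u	/* bytes per cache line */
#define NUM_SETS	2048u	/* 256 KB / 32 B per line / 4 ways */
#define CLUSTER_SIZE	2048u	/* bytes per mbuf cluster */
#define REFCNT_SIZE	4u	/* ASSUMED counter size */

/* Cache set that a given address maps onto. */
static unsigned
cache_set(uintptr_t addr)
{
	return ((unsigned)((addr / LINE_SIZE) % NUM_SETS));
}

/*
 * Count the distinct sets touched by 'n' counters placed at
 * i * stride + offset, for i = 0 .. n-1.
 */
static unsigned
distinct_sets(size_t n, size_t stride, size_t offset)
{
	unsigned char seen[NUM_SETS] = { 0 };
	unsigned count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned s = cache_set((uintptr_t)(i * stride + offset));

		if (!seen[s]) {
			seen[s] = 1;
			count++;
		}
	}
	return (count);
}
```

For 16384 clusters, distinct_sets(16384, CLUSTER_SIZE,
CLUSTER_SIZE - REFCNT_SIZE) gives 32 sets for counters stored at the end
of each cluster, while distinct_sets(16384, REFCNT_SIZE, 0) gives all
2048 sets for the same number of counters in one contiguous array.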
Re: mbuf external buffer reference counters
On Fri, 12 Jul 2002, Giorgos Keramidas wrote:

> On 2002-07-12 07:45 +, Bosko Milekic wrote:
> >
> > So I guess that what we're dealing with isn't really a
> > "monodirectional" ring. Right?
>
> No it isn't. It looks more like the "dining philosophers" problem.
> But that problem's solution would require at least one mutex for every
> part of the ring :-(

The stuff under consideration originally came from OSF/1, which became
Tru64.

That was heavily SMP.
Can anyone find out what they did?
Re: mbuf external buffer reference counters
On Fri, Jul 12, 2002 at 11:03:45AM -0700, John Polstra wrote:
> I've been out of town and I realize I'm coming into this thread late
> and that it has evolved a bit. But I still think it's worthwhile to
> point out a very big problem with the idea of putting the reference
> count at the end of each mbuf cluster. It would have disastrous
> consequences for performance because of cache effects. Bear with me
> through a little bit of arithmetic.
>
> Consider a typical PIII CPU that has a 256 kbyte 4-way set-associative
> L2 cache with 32-byte cache lines. 4-way means that there are 4
> different cache lines associated with each address. Each group of 4
> is called a set, and each set covers 32 bytes of the address space
> (the cache line size).
>
> The total number of sets is:
>
> 256 kbytes / 32 bytes per line / 4 lines per set = 2048 sets
>
> and as mentioned above, each set covers 32 bytes.
>
> The cache wraps around every 256 kbytes / 4-way = 64 kbytes of address
> space. In other words, if address N maps onto a given set, then
> addresses N + 64k, N + 128k, etc. all map onto the same set.
>
> An mbuf cluster is 2 kbytes and all mbuf clusters are well-aligned.
> So the wrap around of the cache occurs every 64 kbytes / 2 kbytes per
> cluster = 32 clusters. To put it another way, all of the reference
> counts would be sharing (i.e., competing for) the same 32 cache sets
> and they would never utilize the remaining 2016 sets at all. Only
> 1.56% of the cache (32 sets / 2048 sets) would be usable for the
> reference counts. This means there would be a lot of cache misses as
> reference count updates caused other reference counts to be flushed
> from the cache.
>
> These cache effects are huge, and they are growing all the time as CPU
> speeds increase while RAM speeds remain relatively constant.

I've thought about the cache issue with regards to the ref. counts
before, actually, and initially, I also thought the exact same thing
as you bring up here. However, there are a few things you need to
remember:

 1) SMP; counters are typically referenced by several different threads
    which may be running on different CPUs at any given point in time,
    and this means that we'll probably end up having corresponding cache
    lines invalidated back and forth anyway;

 2) Using more cache lines may not be better overall, we may be doing
    write-backs of other data already there; in any case, we would
    really have to measure this;

 3) By far the most important: all modifications to the ref. count are
    atomic, bus-locked, ops. I spoke to Peter a little about this and
    although I'm not 100% sure, we think that bus-locked
    fetch-inc/dec-stores need the bus anyway. If that's the case,
    then we really don't care about whether or not they get cached,
    right?

> John
> --
> John Polstra
> John D. Polstra & Co., Inc.                  Seattle, Washington USA
> "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa

Thanks for the cool info and feedback.

Regards,
--
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]
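For readers unfamiliar with the "atomic fetch-inc/dec" operations
mentioned in point 3, here is a minimal user-space sketch using C11
atomics. It is not the kernel's implementation; the names ext_ref,
ext_ref_acquire and ext_ref_release are hypothetical. On x86 a compiler
would typically emit a locked read-modify-write instruction for each
call, but whether that instruction always goes out on the bus is the
open question in the message and is not settled by this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference counter for an external buffer. */
struct ext_ref {
	atomic_uint count;
};

/* A freshly attached buffer starts with one reference. */
static void
ext_ref_init(struct ext_ref *r)
{
	atomic_init(&r->count, 1);
}

/* Take an additional reference (atomic fetch-and-increment). */
static void
ext_ref_acquire(struct ext_ref *r)
{
	atomic_fetch_add(&r->count, 1);
}

/*
 * Drop a reference (atomic fetch-and-decrement).  Returns true when
 * the caller held the last reference and should free the buffer.
 */
static bool
ext_ref_release(struct ext_ref *r)
{
	return (atomic_fetch_sub(&r->count, 1) == 1);
}
```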
RE: xl checksum and dsniff
> #define XL905B_CSUM_FEATURES 0

This worked. dsniff is behaving just fine now.

Next I'll try to track down whether this is a libnet problem, a libnids
problem or a dsniff problem, so I know which project I need to inform.

Thanks,

MikeC
RE: xl checksum and dsniff
On Fri, 12 Jul 2002, Cambria, Mike wrote:

:
:> #define XL905B_CSUM_FEATURES 0
:
:This worked. dsniff is behaving just fine now.
:
:Next I'll try to track down whether this is a libnet problem, a libnids
:problem or a dsniff problem, so I know which project I need to inform.

IIRC, the problem is BPF b/c it doesn't know the checksum since the
calculation was offloaded, no?

--
Andrew R. Reiter
[EMAIL PROTECTED]
[EMAIL PROTECTED]
RE: xl checksum and dsniff
> -----Original Message-----
> From: Andrew R. Reiter [mailto:[EMAIL PROTECTED]]
>
> :Next I'll try to track down whether this is a libnet problem, a libnids
> :problem or a dsniff problem, so I know which project I need to inform.
>
> IIRC, the problem is BPF b/c it doesn't know the checksum since the
> calculation was offloaded, no?

Possibly, or perhaps libpcap?

Now that I know checksum offload is indeed involved, I booted the
original kernel and poked around. Using dsniff -c, dsniff was able to
see packets received just fine. The half of the session sent is what
dsniff can't track.

Packets received, although tcpdump shows "bad checksum", are seen by
dsniff just fine. I expected it to be the other way around.

MikeC
Re: mbuf external buffer reference counters
In article <[EMAIL PROTECTED]>,
Bosko Milekic <[EMAIL PROTECTED]> wrote:
>
> I've thought about the cache issue with regards to the ref. counts
> before, actually, and initially, I also thought the exact same thing
> as you bring up here. However, there are a few things you need to
> remember:
>
> 1) SMP; counters are typically referenced by several different threads
> which may be running on different CPUs at any given point in time, and
> this means that we'll probably end up having corresponding cache lines
> invalidated back and forth anyway;

Agreed. The PII and newer CPUs do have some short cuts built in that
mitigate this somewhat by doing direct cache-to-cache updates in the
SMP case. But quantitatively I don't know how much that helps.

> 2) Using more cache lines may not be better overall, we may be doing
> write-backs of other data already there; in any case, we would really
> have to measure this;

The research that led to the slab allocator demonstrated pretty
conclusively that, at least in general, it's better to spread out the
usage across all cache lines rather than compete for just a few.
Measurements trump research, though, as long as the measurements
reflect real-world usage patterns.

If you decide to pack the refcounts into the clusters themselves, it
might be better to put the refcount at the front of each cluster, and
offset the packet data by 16 bytes to make room for it. That way, the
reference count would be in the same cache line as the first part of
the packet header -- a cache line which is almost certain to be
accessed (though probably not dirtied) anyway.

> 3) By far the most important: all modifications to the ref. count are
> atomic, bus-locked, ops. I spoke to Peter a little about this and
> although I'm not 100% sure, we think that bus-locked
> fetch-inc/dec-stores need the bus anyway. If that's the case,
> then we really don't care about whether or not they get cached, right?

I'm afraid I don't know the answer to that.

The majority of systems will be uniprocessor for a good long time, and
I would hate to see their performance sacrificed needlessly.

John
--
  John Polstra
  John D. Polstra & Co., Inc.                  Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."
                                                -- Chögyam Trungpa
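The front-of-cluster layout suggested above, sketched as a C struct so
the cache-line claim can be checked with offsetof. This assumes a 4-byte
counter, 32-byte cache lines and clusters aligned to at least a cache
line; the struct name cluster_front, its field names and the helper
same_line are hypothetical and do not come from any FreeBSD header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLUSTER_SIZE	2048u	/* bytes per cluster */
#define LINE_SIZE	32u	/* bytes per cache line */
#define DATA_OFFSET	16u	/* packet data starts 16 bytes in */

/* Hypothetical cluster layout with the counter at the front. */
struct cluster_front {
	uint32_t refcnt;				/* ASSUMED 4 bytes */
	uint8_t  pad[DATA_OFFSET - sizeof(uint32_t)];	/* keeps data at 16 */
	uint8_t  data[CLUSTER_SIZE - DATA_OFFSET];	/* packet data */
};

/* Do two offsets within an aligned cluster fall in the same line? */
static int
same_line(size_t a, size_t b)
{
	return (a / LINE_SIZE == b / LINE_SIZE);
}
```

With this layout the counter (offset 0) and the start of the packet data
(offset 16) share the first 32-byte line, whereas a counter at offset
2044 sits in the last line of the cluster, away from the headers.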
Re: mbuf external buffer reference counters
Julian Elischer writes:
 >
 >
 > On Fri, 12 Jul 2002, Giorgos Keramidas wrote:
 >
 > > On 2002-07-12 07:45 +, Bosko Milekic wrote:
 > > >
 > > > So I guess that what we're dealing with isn't really a
 > > > "monodirectional" ring. Right?
 > >
 > > No it isn't. It looks more like the "dining philosophers" problem.
 > > But that problem's solution would require at least one mutex for every
 > > part of the ring :-(
 >
 > The stuff under consideration originally came from OSF/1 which became
 > Tru64
 >
 > that was heavily SMP
 > can anyone find out what they did?
 >

From looking at a Tru64 5.1 header file, it looks like they do per-ext
locking and declare an MBUF_EXT_LOCK(m) macro. It is not clear how
one is supposed to use this & it appears to be undocumented. Tru64
also has a global mbuf lock. Tru64 4.x does not appear to have the
MBUF_EXT_LOCK (so I think it uses just the global MBUF_LOCK for all
mbuf manipulations; and I'll bet that just does a 'splimp' on UP
systems).

AIX also has this nice ext_refq structure and it also appears to be
doing per-ext locking. From mbuf.h, AIX's ext mbufs are all just
malloc'ed memory. This jibes with the pain & suffering I had when
writing an ethernet driver for AIX & finding mbufs which cross page
boundaries.

MacOS-X seems to have both a refq and a refcnt array like in -stable.
It appears to use the refq for externally managed data and the refcnt
for system clusters. As for locking, it looks a lot like Tru64 4.x --
it has a global mbuf lock. Perhaps this is what the original Mach did?

WRT using refqs -- I think that Bosko's system in -current is just
as nice from a user's perspective, and if we can work out an
acceptable solution for doing refcnts, let's not revert to refqs.

I agree with John about where to put the refcnts: I think we should
have a big hunk of memory for the refcnts like in -stable. My
understanding is that the larger virtually contig mbufs are the only
thing that would cause a problem for this, or is that incorrect?
If so, then why not just put their counter elsewhere?

One concrete example against putting the refcnts into the cluster is
that it would cause NFS servers & clients to use 25% more mbufs for a
typical 8K read or write request.

Drew
Re: mbuf external buffer reference counters
On Fri, Jul 12, 2002 at 06:55:37PM -0400, Andrew Gallatin wrote:
[...]

FWIW, BSD/OS also does something similar to -STABLE.

[...]
> I agree with John about where to put the refcnts: I think we should
> have a big hunk of memory for the refcnts like in -stable. My
> understanding is that the larger virtually contig mbufs are the only
> thing that would cause a problem for this, or is that incorrect?
> If so, then why not just put their counter elsewhere?
>
> One concrete example against putting the refcnts into the cluster is
> that it would cause NFS servers & clients to use 25% more mbufs for a
> typical 8K read or write request.

If we decide to allocate jumbo bufs from their own separate map as
well then we have no wastage for the counters for clusters if we keep
them in a few pages, like in -STABLE, and it should all work out fine.

For the jumbo bufs I still maintain that we should keep the counter
for them at the end of the buf because the math works out (see my post
in that thread with the math example) and because their total size is
not a power of 2 anyway. They'll also be more randomly spread out and
use more cache slots.

> Drew

Regards,
--
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]
Re: mbuf external buffer reference counters
Bosko Milekic writes:

<...>

 > If we decide to allocate jumbo bufs from their own separate map as
 > well then we have no wastage for the counters for clusters if we keep
 > them in a few pages, like in -STABLE, and it should all work out fine.

That sounds good.

 > For the jumbo bufs I still maintain that we should keep the counter
 > for them at the end of the buf because the math works out (see my post
 > in that thread with the math example) and because their total size is
 > not a power of 2 anyway. They'll also be more randomly spread out and
 > use more cache slots.

How about, as (I think it was) John suggested, putting the counters at
the front of the buffer so they'd be close to the headers, etc. in the
cache and would be less likely to cause their own unique cache miss
when you access them?

Drew
RE: mbuf external buffer reference counters
> Julian Elischer writes:
> >
> >
> > The stuff under consideration originally came from OSF/1 which became
> > Tru64
> >
> > that was heavily SMP
> > can anyone find out what they did?
>
> From looking at a Tru64 5.1 header file, it looks like they do per-ext
> locking and declare an MBUF_EXT_LOCK(m) macro. It is not clear how
> one is supposed to use this & it appears to be undocumented. Tru64
> also has a global mbuf lock. Tru64 4.x does not appear to have the
> MBUF_EXT_LOCK (so I think it uses just the global MBUF_LOCK for all
> mbuf manipulations; and I'll bet that just does a 'splimp' on UP
> systems).
>

When I was at Hitachi in Waltham, MA, we did a port of OSF/1 to
Hitachi's SR8000 Super.

http://www.hitachi-eu.com/hel/hpcc/

It is based on a 64-bit implementation of the PowerPC. My only
involvement was with the Large File System, and I really don't
remember how the cluster bufs' ref count was implemented. Most of the
people who may have been involved are at Egenera, with the code
somewhere at Hitachi in Japan. If you have any contacts at either
place, you might check with them.

Jim
Re: Masquerade fails to suppress X-sender
On Thu, Jul 11, 2002 at 01:30:53AM +0200, Julian Stacey wrote:
> Hi [EMAIL PROTECTED]
> Since I gave my FreeBSD-4.5-Release gateway a new sendmail.cf today,
> I've been getting both these in my headers:
> Received: from jhs.muc.de (520006753247-0001@[217.235.121.155])
>	by fmrl11.sul.t-online.com with esmtp id
>	17SPs5-0MzVXEC; Thu, 11 Jul 2002 00:23:41 +0200
> X-sender: [EMAIL PROTECTED]
> I never used to have 520006753247 appear, (I've confirmed that by
> inspecting my morning's post to a simple exploder list (that leaves
> headers unchanged), which came back clean without any 520006753247)
> ( Reason I don't want people to see 520006753247-0001 is that's my
> account, & while not private as such, no need to publicise,
> & don't want people emailing me (or spamming!) there either ).
>
> So I'd like to kill off that number from appearing, any idea how to do it ?

The '-f' option of sendmail(8) would do this. See also the "trusted
user" options for your sendmail.mc. I am not aware of a way to set up
a fake user in the sendmail.{mc,cf} files, but that does not mean
there isn't one.
--
Crist J. Clark                     |     [EMAIL PROTECTED]
                                   |     [EMAIL PROTECTED]
http://people.freebsd.org/~cjc/    |     [EMAIL PROTECTED]