Alfred,

On Sun, Nov 30, 2014 at 10:10:57AM -0800, Alfred Perlstein wrote:
A> Splitting this into the mbuf layer adds a huge level of complexity where
A> again, there are already completion paths in the socket layer to do
A> this. I am completely confused as to why this couldn't just be done
A> with the socket callback system already in place. Very open to being
A> educated on this!
As I said in September, I can't see how the socket buffer upcall system can be used here. It does the opposite: it wakes something up in the kernel when data arrives at a socket. So I am also very open to being shown how I could apply it here.

A> The concept of "not filled mbufs" in a socket buffer seems absolutely
A> wrong at a glance, I'm sure with some better explanation this would all
A> make sense, but really am still not convinced this is at all the right
A> way to go on this.

I'll put a longer explanation at the end of this email.

A> Does any other OS do this for any reason? Or is this just a short
A> sighted hack for an experiment in sendfile?

Well, we also have plans to put TLS into the kernel, which would use them as well. :) There might be more consumers. Note that sf_bufs were initially "just a short sighted hack" for sendfile(2), and are now used in several places in the kernel.

A> I am really trying very hard to rationalize this change, so I will ask,
A> is there something about keeping TCP windows open that you are hoping to
A> accomplish that you can not otherwise do without sb_ccc and sb_acc? If
A> not then why is all this stuff being stuffed into mbufs as opposed to
A> using callbacks? It really seems wrong, my thoughts are "this is like
A> kse for mbufs" something done with good intentions, but is complex and
A> will have to be ripped out later. Am I wrong here?

No, this has nothing to do with keeping TCP windows open. Here is the longer explanation: the new sendfile(2) is going to be non-blocking on disk. That means the syscall returns to the application immediately, without waiting for the I/O to complete. The application can then carry on with its work: it can write(2) to the socket, or run another sendfile(2) on the same socket. Now you probably see the problem: if the non-blocking sendfile doesn't put a placeholder for its data into the socket buffer, then data in the socket buffer is going to be interleaved in random order.
Please note that I don't move anything to the mbuf layer, as you claim. Neither mbuf.h, nor kern_mbuf.c, nor uipc_mbuf.c is modified. This is a new feature held internally in the socket buffer code.

Yes, the sweep of changing sb_cc to sbavail() and sbused() was large. That is a consequence of socket buffers being exposed to the stack, with the stack poking into the structure directly. If decades ago the socket code had been developed in a more self-contained way, no sweep would have been needed. As I already noted in the commit message, my opinion is that socket buffers need to be made protocol-dependent and more opaque. The SCTP code taught me that. As for TCP/UDP: right now our socket buffer structure supports both SOCK_STREAM and SOCK_DGRAM, but this is achieved through code complication, and I see no good reason to keep it so generic. The original BSD soreceive() function was hell before it was split into soreceive_stream() and soreceive_dgram(). Splitting the sockbuf into two different types would finish that work.

-- 
Totus tuus, Glebius.
_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"