Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-18 Thread Evgeniy Polyakov
On Thu, Aug 17, 2006 at 04:24:26PM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > >Feel free to implement any receiving policy inside _separated_ allocator > >to meet your needs, but if allocator depends on main system's memory > >conditions it is always possible that it will fail to make for

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-17 Thread Daniel Phillips
Evgeniy Polyakov wrote: On Thu, Aug 17, 2006 at 09:15:14PM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: I got openssh as example of situation when system does not know in advance, what sockets must be marked as critical. OpenSSH works with network and unix sockets in parallel, so you need

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-17 Thread Evgeniy Polyakov
On Thu, Aug 17, 2006 at 09:15:14PM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: > > I got openssh as example of situation when system does not know in > > advance, what sockets must be marked as critical. > > OpenSSH works with network and unix sockets in parallel, so you need to > > hack ope

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-17 Thread Peter Zijlstra
On Thu, 2006-08-17 at 22:42 +0400, Evgeniy Polyakov wrote: > On Thu, Aug 17, 2006 at 11:01:52AM -0700, Daniel Phillips ([EMAIL PROTECTED]) > wrote: > > *** The system is not OOM, it is in reclaim, a transient condition *** > > It does not matter how condition when not every user can get memory is

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-17 Thread Evgeniy Polyakov
On Thu, Aug 17, 2006 at 11:01:52AM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > *** The system is not OOM, it is in reclaim, a transient condition *** It does not matter how condition when not every user can get memory is called. And actually no one can know in advance how long it will be.

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-17 Thread Daniel Phillips
Evgeniy Polyakov wrote: Just for clarification - it will be completely impossible to login using openssh or some other privilege separation protocol to the machine due to the nature of unix sockets. So you will be unable to manage your storage system just because it is in OOM - it is not what i

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-16 Thread Evgeniy Polyakov
On Wed, Aug 16, 2006 at 09:48:37PM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > Evgeniy Polyakov wrote: > >On Sun, Aug 13, 2006 at 01:16:15PM -0700, Daniel Phillips > >([EMAIL PROTECTED]) wrote: > >>Indeed. The rest of the corner cases like netfilter, layered protocol and > >>so on need t

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-16 Thread Daniel Phillips
Evgeniy Polyakov wrote: On Mon, Aug 14, 2006 at 08:45:43AM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: Just pure openssh for control connection (admin should be able to login). These periods of degenerated functionality should be short and infrequent albeit critical for machine recovery.

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-16 Thread Daniel Phillips
Evgeniy Polyakov wrote: On Sun, Aug 13, 2006 at 01:16:15PM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: Indeed. The rest of the corner cases like netfilter, layered protocol and so on need to be handled, however they do not need to be handled right now in order to make remote storage on a

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-15 Thread Pavel Machek
Hi! > Recently, Peter Zijlstra and I have been busily collaborating on a > solution to the memory deadlock problem described here: > >http://lwn.net/Articles/144273/ >"Kernel Summit 2005: Convergence of network and storage paths" > > We believe that an approach very much like today's pat

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-13 Thread Evgeniy Polyakov
On Mon, Aug 14, 2006 at 08:45:43AM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: > > Just for clarification - it will be completely impossible to login using > > openssh or some other privilege separation protocol to the machine due > > to the nature of unix sockets. So you will be unable to

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-13 Thread Peter Zijlstra
On Mon, 2006-08-14 at 09:13 +0400, Evgeniy Polyakov wrote: > On Sun, Aug 13, 2006 at 01:16:15PM -0700, Daniel Phillips ([EMAIL PROTECTED]) > wrote: > > Indeed. The rest of the corner cases like netfilter, layered protocol and > > so on need to be handled, however they do not need to be handled ri

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-13 Thread Evgeniy Polyakov
On Sun, Aug 13, 2006 at 01:16:15PM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > Indeed. The rest of the corner cases like netfilter, layered protocol and > so on need to be handled, however they do not need to be handled right now > in order to make remote storage on a lan work properly.

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-13 Thread Daniel Phillips
Evgeniy Polyakov wrote: One must receive a packet to determine if that packet must be dropped until tricky hardware with header split capabilities or MMIO copying is used. Peter uses special pool to get data from when system is in OOM (at least in his latest patchset), so allocations are separate

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-13 Thread Daniel Phillips
Peter Zijlstra wrote: On Wed, 2006-08-09 at 16:54 -0700, David Miller wrote: People are doing I/O over IP exactly for its ubiquity and flexibility. It seems a major limitation of the design if you cancel out major components of this flexibility. We're not, that was a bit of my own frustratio

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-13 Thread Evgeniy Polyakov
On Sun, Aug 13, 2006 at 01:06:21PM +0400, Evgeniy Polyakov ([EMAIL PROTECTED]) wrote: > On Sat, Aug 12, 2006 at 05:46:07PM -0700, David Miller ([EMAIL PROTECTED]) > wrote: > > From: Evgeniy Polyakov <[EMAIL PROTECTED]> > > Date: Sat, 12 Aug 2006 13:37:06 +0400 > > > > > Does it? I thought it is p

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-13 Thread Evgeniy Polyakov
On Sat, Aug 12, 2006 at 05:46:07PM -0700, David Miller ([EMAIL PROTECTED]) wrote: > From: Evgeniy Polyakov <[EMAIL PROTECTED]> > Date: Sat, 12 Aug 2006 13:37:06 +0400 > > > Does it? I thought it is possible to only have 64k of working sockets per > > device in TCP. > > Where does this limit come

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Rik van Riel
David Miller wrote: From: Peter Zijlstra <[EMAIL PROTECTED]> Date: Sat, 12 Aug 2006 12:18:07 +0200 65535 sockets * 128 packets * 16384 bytes/packet = 2^16 * 2^7 * 2^14 = 2^(16+7+14) = 2^37 = 128G of memory per IP And systems with a lot of IP numbers are not unthinkable. TCP restricts the am

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread David Miller
From: Evgeniy Polyakov <[EMAIL PROTECTED]> Date: Sat, 12 Aug 2006 13:37:06 +0400 > Does it? I thought it is possible to only have 64k of working sockets per > device in TCP. Where does this limit come from? You think there is something magic about 64K local ports, but if remote IP addresses in th

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread David Miller
From: Peter Zijlstra <[EMAIL PROTECTED]> Date: Sat, 12 Aug 2006 12:18:07 +0200 > 65535 sockets * 128 packets * 16384 bytes/packet = > 2^16 * 2^7 * 2^14 = 2^(16+7+14) = 2^37 = 128G of memory per IP > > And systems with a lot of IP numbers are not unthinkable. TCP restricts the amount of global m

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Peter Zijlstra
On Sat, 2006-08-12 at 19:08 +0400, Evgeniy Polyakov wrote: > One must receive a packet to determine if that packet must be dropped > until tricky hardware with header split capabilities or MMIO copying is > used. True, that is done, but we then discard this packet at the very first moment we kno

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Evgeniy Polyakov
On Sat, Aug 12, 2006 at 10:56:31AM -0400, Rik van Riel ([EMAIL PROTECTED]) wrote: > >Yep. Socket allocations end up with alloc_skb() which is essentially the > >same as what is being done for receiving path skbs. > >If you really want to separate critical from non-critical sockets, it is > >much be

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Rik van Riel
Evgeniy Polyakov wrote: On Sat, Aug 12, 2006 at 10:40:23AM -0400, Rik van Riel ([EMAIL PROTECTED]) wrote: Evgeniy Polyakov wrote: On Sat, Aug 12, 2006 at 11:19:49AM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: As you described above, memory for each packet must be allocated (either from

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Evgeniy Polyakov
On Sat, Aug 12, 2006 at 10:40:23AM -0400, Rik van Riel ([EMAIL PROTECTED]) wrote: > Evgeniy Polyakov wrote: > >On Sat, Aug 12, 2006 at 11:19:49AM +0200, Peter Zijlstra > >([EMAIL PROTECTED]) wrote: > >>>As you described above, memory for each packet must be allocated (either > >>>from SLAB or fro

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Rik van Riel
Evgeniy Polyakov wrote: On Sat, Aug 12, 2006 at 11:19:49AM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: As you described above, memory for each packet must be allocated (either from SLAB or from reserve), so network needs special allocator in OOM condition, and that allocator should be sepa

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Evgeniy Polyakov
On Sat, Aug 12, 2006 at 01:40:29PM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: > On Sat, 2006-08-12 at 14:42 +0400, Evgeniy Polyakov wrote: > > > When network uses the same allocator, it depends on it, and thus it is > > possible to have (cut by you) a situation when reserve (which depends o

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Peter Zijlstra
On Sat, 2006-08-12 at 14:42 +0400, Evgeniy Polyakov wrote: > When network uses the same allocator, it depends on it, and thus it is > possible to have (cut by you) a situation when reserve (which depends on > SLAB and it's OOM too) is not filled or even does not exist. No, the reserve does not de

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Evgeniy Polyakov
On Sat, Aug 12, 2006 at 02:42:26PM +0400, Evgeniy Polyakov ([EMAIL PROTECTED]) wrote: > > Hence the alternative allocator to use on tight memory conditions. > > If transferred to your implementation, then just steal some pages from > SLAB when new network device is added and use them when OOM hap

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Evgeniy Polyakov
On Sat, Aug 12, 2006 at 12:18:07PM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: > > Does it? I thought it is possible to only have 64k of working sockets per > > device in TCP. > > 65535 sockets * 128 packets * 16384 bytes/packet = > 2^16 * 2^7 * 2^14 = 2^(16+7+14) = 2^37 = 128G of memory per

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Peter Zijlstra
On Sat, 2006-08-12 at 13:37 +0400, Evgeniy Polyakov wrote: > On Sat, Aug 12, 2006 at 11:19:49AM +0200, Peter Zijlstra ([EMAIL PROTECTED]) > wrote: > > > As you described above, memory for each packet must be allocated (either > > > from SLAB or from reserve), so network needs special allocator in

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Evgeniy Polyakov
On Sat, Aug 12, 2006 at 11:19:49AM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: > > As you described above, memory for each packet must be allocated (either > > from SLAB or from reserve), so network needs special allocator in OOM > > condition, and that allocator should be separated from SLAB

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Peter Zijlstra
On Sat, 2006-08-12 at 12:47 +0400, Evgeniy Polyakov wrote: > On Fri, Aug 11, 2006 at 11:42:50PM -0400, Rik van Riel ([EMAIL PROTECTED]) > wrote: > > >Dropping these non-essential packets makes sure the reserve memory > > >doesn't get stuck in some random blocked user-space process, hence > > >you

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-12 Thread Evgeniy Polyakov
On Fri, Aug 11, 2006 at 11:42:50PM -0400, Rik van Riel ([EMAIL PROTECTED]) wrote: > >Dropping these non-essential packets makes sure the reserve memory > >doesn't get stuck in some random blocked user-space process, hence > >you can make progress. > > In short: > - every incoming packet needs t

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-11 Thread Rik van Riel
Peter Zijlstra wrote: You say "critical resource isolation", but it is not the case - consider NFS over UDP - remote side will not stop sending just because receiving socket code drops data due to OOM, or IPsec or compression, which can requires reallocation. There is no "critical resource iso

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-09 Thread Peter Zijlstra
On Wed, 2006-08-09 at 16:54 -0700, David Miller wrote: > From: Peter Zijlstra <[EMAIL PROTECTED]> > Date: Wed, 09 Aug 2006 15:32:33 +0200 > > > The idea is to drop all !NFS packets (or even more specific only > > keep those NFS packets that belong to the critical mount), and > > everybody doing cr

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-09 Thread David Miller
From: Peter Zijlstra <[EMAIL PROTECTED]> Date: Wed, 09 Aug 2006 15:32:33 +0200 > The idea is to drop all !NFS packets (or even more specific only > keep those NFS packets that belong to the critical mount), and > everybody doing critical IO over layered networks like IPSec or > other tunnel constr

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-09 Thread Evgeniy Polyakov
On Wed, Aug 09, 2006 at 03:32:33PM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: > > > > >http://lwn.net/Articles/144273/ > > > > >"Kernel Summit 2005: Convergence of network and storage paths" > > > > > > > > > > We believe that an approach very much like today's patch set is > > > >

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-09 Thread Peter Zijlstra
On Wed, 2006-08-09 at 17:07 +0400, Evgeniy Polyakov wrote: > On Wed, Aug 09, 2006 at 02:37:20PM +0200, Peter Zijlstra ([EMAIL PROTECTED]) > wrote: > > On Wed, 2006-08-09 at 09:46 +0400, Evgeniy Polyakov wrote: > > > On Tue, Aug 08, 2006 at 09:33:25PM +0200, Peter Zijlstra ([EMAIL > > > PROTECTED]

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-09 Thread Evgeniy Polyakov
On Wed, Aug 09, 2006 at 02:37:20PM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: > On Wed, 2006-08-09 at 09:46 +0400, Evgeniy Polyakov wrote: > > On Tue, Aug 08, 2006 at 09:33:25PM +0200, Peter Zijlstra ([EMAIL > > PROTECTED]) wrote: > > >http://lwn.net/Articles/144273/ > > >"Kernel Su

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-09 Thread Peter Zijlstra
On Wed, 2006-08-09 at 09:46 +0400, Evgeniy Polyakov wrote: > On Tue, Aug 08, 2006 at 09:33:25PM +0200, Peter Zijlstra ([EMAIL PROTECTED]) > wrote: > >http://lwn.net/Articles/144273/ > >"Kernel Summit 2005: Convergence of network and storage paths" > > > > We believe that an approach very

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-08 Thread David Miller
From: Daniel Phillips <[EMAIL PROTECTED]> Date: Tue, 08 Aug 2006 22:52:34 -0700 > Agreed. But probably more intrusive than davem would be happy with > at this point. I'm much more happy with Evgeniy's network tree allocator, which has a real design and well thought out long term consequences, th

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-08 Thread Evgeniy Polyakov
On Tue, Aug 08, 2006 at 10:53:55PM -0700, David Miller ([EMAIL PROTECTED]) wrote: > From: Evgeniy Polyakov <[EMAIL PROTECTED]> > Date: Wed, 9 Aug 2006 09:46:48 +0400 > > > There is another approach for that - do not use slab allocator for > > network dataflow at all. It automatically has all you

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-08 Thread David Miller
From: Evgeniy Polyakov <[EMAIL PROTECTED]> Date: Wed, 9 Aug 2006 09:46:48 +0400 > There is another approach for that - do not use slab allocator for > network dataflow at all. It automatically has all you pros and if > implemented correctly can have a lot of additional useful and > high-performan

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-08 Thread Daniel Phillips
Evgeniy Polyakov wrote: On Tue, Aug 08, 2006 at 09:33:25PM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: http://lwn.net/Articles/144273/ "Kernel Summit 2005: Convergence of network and storage paths" We believe that an approach very much like today's patch set is necessary for NBD, iSC

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-08 Thread Evgeniy Polyakov
On Tue, Aug 08, 2006 at 09:33:25PM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: >http://lwn.net/Articles/144273/ >"Kernel Summit 2005: Convergence of network and storage paths" > > We believe that an approach very much like today's patch set is > necessary for NBD, iSCSI, AoE or the l

[RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-08 Thread Peter Zijlstra
From: Daniel Phillips <[EMAIL PROTECTED]> Recently, Peter Zijlstra and I have been busily collaborating on a solution to the memory deadlock problem described here: http://lwn.net/Articles/144273/ "Kernel Summit 2005: Convergence of network and storage paths" We believe that an approach v