> at least one location:
>
> When adding a new dva node into the tree, a kmem_alloc is done with
> a KM_SLEEP argument.
>
> thus, this process thread could block waiting for memory.
>
> I would suggest adding a pre-allocated pool of dva nodes.
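(To illustrate the trade-off being discussed: the sketch below contrasts a single shared freelist, as proposed, with a per-thread "magazine" cache layered on top of it, which is roughly the shape of the magazine layer the Solaris kmem allocator uses. This is a simplified user-space model, not the actual Solaris implementation; the names `depot_t`, `cache_alloc`, `cache_free`, and `MAG_SIZE` are illustrative, not real kernel identifiers.)

```c
#include <pthread.h>
#include <stdlib.h>

#define MAG_SIZE 16  /* objects cached per thread, accessed lock-free */

typedef struct obj { struct obj *next; } obj_t;

/* Shared depot: a single freelist protected by one lock.  If every
 * alloc/free goes through here, all CPUs serialize on this mutex --
 * the scaling bottleneck a lone private freelist creates. */
typedef struct {
    pthread_mutex_t lock;
    obj_t *head;
} depot_t;

static depot_t depot = { PTHREAD_MUTEX_INITIALIZER, NULL };

/* Per-thread magazine: most allocs/frees touch only this, no lock. */
static __thread obj_t *mag_head;
static __thread int mag_count;

static obj_t *depot_get(void)      /* slow path: refill under the lock */
{
    pthread_mutex_lock(&depot.lock);
    obj_t *o = depot.head;
    if (o != NULL)
        depot.head = o->next;
    pthread_mutex_unlock(&depot.lock);
    return o;
}

obj_t *cache_alloc(void)
{
    if (mag_head != NULL) {        /* fast path: no contention at all */
        obj_t *o = mag_head;
        mag_head = o->next;
        mag_count--;
        return o;
    }
    obj_t *o = depot_get();
    if (o == NULL)                 /* backing store; stands in for the
                                    * kmem_alloc the cache sits above */
        o = malloc(sizeof (obj_t));
    return o;
}

void cache_free(obj_t *o)
{
    if (mag_count < MAG_SIZE) {    /* fast path: park in the magazine */
        o->next = mag_head;
        mag_head = o;
        mag_count++;
        return;
    }
    pthread_mutex_lock(&depot.lock);   /* slow path: back to the depot */
    o->next = depot.head;
    depot.head = o;
    pthread_mutex_unlock(&depot.lock);
}
```

Note that the per-thread magazine removes the lock from the common case entirely, which is why a framework-level cache scales where an ad-hoc shared freelist does not; it also makes clear that caching frees memory-exhaustion pressure not at all, since the depot ultimately still draws from the same backing allocator.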
This is how the Solaris memory allocator works. It keeps pools of
"pre-allocated" nodes around until memory conditions are low.

> When a new dva node is needed, first check this pre-allocated
> pool and allocate from there.

There are two reasons why this is a really bad idea:

- the system will run out of memory even sooner if people start
  building their own freelists

- a single freelist does not scale; at two CPUs it becomes the
  allocation bottleneck (I've measured and removed two such
  bottlenecks from Solaris 9)

You might want to learn about how the Solaris memory allocator works;
it pretty much works the way you want, except that it is all part of
the framework. And, just as in your case, it does run out sometimes,
but a private freelist does not help against that.

> Why? This would eliminate a possible sleep condition if memory
> is not immediately available. The pool would add a working
> set of dva nodes that could be monitored. Per-alloc latencies
> could be amortized over a chunk allocation.

That's how the Solaris memory allocator already works.

Casper
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss