On 29/06/2011 12:12 PM, Johnathan Corgan wrote:
On Tue, Jun 28, 2011 at 21:38, Marcus D. Leech <mle...@ripnet.com> wrote:
I'll make the observation that there's just got to be a better
buffer allocation policy than the existing one. Should it *really*
take gigabytes of VM, and hundreds of MB of RSS, to execute the
flow-graph I've attached? The largest inter-block objects (the
vectors in front of and behind the FFT) are on the order of a few
MB, and even accounting for some ballooning factor so that several
such vectors can be "in flight" at once, that doesn't add up to
hundreds of MB of RSS and several GB of virtual size.
GNU Radio achieves its streaming performance in part due to a circular
buffer design that maps the same physical memory twice in the virtual
memory map for the process, directly adjacent to each other. This
allows work function code to avoid doing any bounds checking or
modular index arithmetic operations when reading and writing its input
and output buffers. Going "off the end" of the buffer just results in
accessing the start of the buffer which is mapped into the next block
of memory. Not only does this make the code and I/O much faster, but
the work function code gets a lot simpler.
I wonder if that assertion is generally true, or only in some cases?
Increment and test shouldn't be *that* expensive.
In order to make this strategy work, however, the buffer memory must
end on an exact multiple of the size of a single stream item. Thus,
the allocated buffer size has to be the LCM (least common multiple) of
the machine page size and the item size. The machine page size is
typically 4096 bytes, so in simple cases when the item size is a small
power of two, the minimum buffer size remains 4096 bytes. (In
practice, other things the user might ask of the runtime can increase
the size by an integer factor of this.) The usual GNU Radio buffer in
these cases is 32K, which can hold 16K shorts, 8K floats, 4K complex
floats, etc.
If your stream item size is larger than 4K, things can get ugly. The
best case is that your item size is a power of two, so the LCM ends
up being the same as the item size. Then GNU Radio can allocate
memory for however many items it needs to hold in the buffer simply
as a multiple of that. The worst case is when the item size is
relatively prime to the page size; then the LCM is the product of
the page size and the item size. Asking GNU Radio to allocate a
buffer for items of size 4097 results in a minimum buffer size of
4096*4097, or slightly over 16 MB. The virtual memory footprint will
be twice that.
The worst-case item in my flow-graph is the FFT vector, which is
typically large (sometimes as much as 512K), but it's always a
power-of-two number of bins, and since each complex float takes 8
bytes, the resulting item size should also be a power of two.
I'm guessing something like this last is what is happening in your
flowgraph. I'd trace through all the blocks in your graph and their
configuration parameters, then manually calculate what the item size
is and the LCM with 4096 that results.
My recollection is that buffer sizes get ballooned outward based on
the degree of decimation downstream from the source block, a policy
I've never quite gotten my head around. The FFT in my flow-graph
might get decimated by a factor of 10 or more which, given that
policy, might lead to chunky allocations.
_______________________________________________
Discuss-gnuradio mailing list
Discuss-gnuradio@gnu.org
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio