On Sat, Dec 12, 2015 at 4:41 PM, Jeff Janes <jeff.ja...@gmail.com> wrote: > Those usages make sense to me, as they are locally self-contained and > it is clear what they are in contradistinction to. But your usage is > spread throughout (even in function names, not just comments) and > seems to contradict the current usage as yours are not separately > palloced, as the "proper" ones described here are. I think that > "proper" only works when the same comment also defines the > alternative, rather than as some file-global description. Maybe > "pooltuple" rather than "tupleproper"
I don't think of it that way. The "tuple proper" is the thing that the client passes to their tuplesort -- the thing they are actually interested in having sorted. Like an IndexTuple for CREATE INDEX callers, for example. SortTuple is just an internal implementation detail. (That appears all over the file tuplesort.c, just as my new references to "tuple proper" do. But neither appear elsewhere.) >>> Also, if I am reading this correctly, when we refill a pool from a >>> logical tape we still transform each tuple as it is read from the disk >>> format to the memory format. This inflates the size quite a bit, at >>> least for single-datum tuples. If we instead just read the disk >>> format directly into the pool, and converted them into the in-memory >>> format when each tuple came due for the merge heap, would that destroy >>> the locality of reference you are seeking to gain? >> >> Are you talking about alignment? > > Maybe alignment, but also the size of the SortTuple struct itself, > which is not present on tape but is present in memory if I understand > correctly. > > When reading 128kb (32 blocks) worth of in-memory pool, it seems like > it only gets to read 16 to 18 blocks of tape to fill them up, in the > case of building an index on single column 32-byte random md5 digests. > I don't exactly know where all of that space goes, I'm taking an > experimentalist approach. I'm confused. readtup_datum(), just like every other READTUP() variant, has the new function tupproperalloc() as a drop-in replacement for the master branch palloc() + USEMEM() calls. It is true that tupproperalloc() (and a couple of other places relating to preloading) know *a little* about the usage pattern -- tupproperalloc() accepts a "tape number" argument to know what partition within the large pool/buffer to use for each logical allocation. However, from the point of view of correctness, tupproperalloc() should function as a drop-in replacement for palloc() + USEMEM() calls in the context of the various READTUP() routines. I have done nothing special with any particular READTUP() routine, including readtup_datum() (all READTUP() routines have received the same treatment). Nothing else was changed in those routines, including how tuples are stored on tape. The datum case does kind of store the SortTuples on tape today in one very limited sense, which is that the length is stored fairly naively (that's already available from the IndexTuple in the case of writetup_index(), for example, but length must be stored explicitly for the datum case). My guess is you're confusion comes from the fact that the memtuples array (the array of SortTuple) is also factored in to memory accounting, but that grows at geometric intervals, whereas the existing READTUP() retail palloc() calls (and their USEMEM() memory accounting calls) occur in drips and drabs. It's probably the case that the sizing of the memtuples array and the amount of memory we use for that rather than retail palloc()/"tuple proper" memory is a kind of arbitrary (why should the needs be the same when SortTuples are merge step "slots"?), but I don't think that's the biggest problem in this general area at all. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers