On Fri, Mar 28, 2014 at 3:34 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Way way back in commit ae93e5fd6e8a7e2321e87d23165d9d7660cde598, > we established a coding rule that it was okay for composite Datums > to contain external (out-of-line) toasted fields, as long as such > toasting didn't go more than one level deep in any tuple. This meant > that heap_form_tuple had to go through nontrivial pushups to maintain > that invariant: each composite field has to be inspected to see if any > of its component fields are external datums. Surprisingly, no one has > complained about the cost of the lookups that are required to see > whether fields are composite in the first place. > > However, in view of today's bug report from Jan Pecek, I'm wondering > if we shouldn't rethink this. Jan pointed out that the array code was > failing to prevent composites-with-external-fields from getting into > arrays, and after a bit of looking around I'm afraid there are more such > bugs elsewhere. One example is in the planner's evaluate_expr(), which > supposes that just PG_DETOAST_DATUM() is enough to make a value safe for > long-term storage in a plan tree. Range types are making the same sort > of assumption as arrays (hm, can you have a range over a composite type? > Probably, considering we have sort operators for composites). And there > are places in the index AMs that seem to assume likewise, which is an > issue for AMs in which an indexed value could be composite. > > I think we might be better off to get rid of toast_flatten_tuple_attribute > and instead insist that composite Datums never contain any external toast > pointers in the first place. That is, places that call heap_form_tuple > to make a composite Datum (rather than a tuple that's due to be stored > to disk) would be responsible for detoasting any fields with external > values first. We could make a wrapper routine for heap_form_tuple to > take care of this rather than duplicating the code in lots of places. > > From a performance standpoint this is probably a small win. In cases > where a composite Datum is formed and ultimately saved to disk, it should > be a win, since we'd have to detoast those fields anyway, and we can avoid > the overhead of an extra disassembly and reassembly of the composite > value. If the composite value is never sent to disk, it's a bit harder > to tell: we lose if the specific field value is never extracted from the > composite, but on the other hand we win if it's extracted more than once. > In any case, adding the extra syscache lookups involved in doing > toast_flatten_tuple_attribute in lots more places isn't appealing. > > From a code correctness standpoint, the question really is whether we can > find all the places that build composite datums more easily than we can > find all the places that ought to be calling toast_flatten_tuple_attribute > and aren't. I have to admit I'm not sure about that. There seem to be > basically two places to fix in the main executor (ExecEvalRow and > ExecEvalFieldStore), and roughly half a dozen calls of heap_form_tuple in > the various PLs, but I'm not sure I've not missed some cases. > > One thing we could do to try to flush out missed cases is to remove > heap_form_tuple's setting of the tuple-Datum header fields, pushing > that functionality into the new wrapper routine. Then, any un-updated > code would generate clearly invalid composite datums, rather than only > failing in infrequent corner cases. > > Another issue is what about third-party code. There seems to be risk > either way, but it would accrue to completely different code depending > on which way we try to fix this. > > Thoughts?
Trying to follow along here. Am I correct in saying that if you make these changes any SQL level functionality (say, creating a composite type containing a large array) that used to work will continue to work? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers