On Mon, Mar 15, 2021 at 6:58 PM Andres Freund <and...@anarazel.de> wrote: > I don't particularly like PG_RETURN_HEAPTUPLEHEADER_RAW(). What is "raw" > about it? It also seems to me like there needs to at least be a > sentence or two explaining when to use which of the functions.
It seemed like the natural name to me; we use "raw" elsewhere to mean that fewer things are magically addressed on behalf of the caller, e.g. HeapTupleHeaderGetRawXmin. I'm open to suggestions, however. > I think heap_copy_tuple_as_raw_datum() should grow an assert checking > there are no external columns? Yeah, could be done. > I'm don't like that after 0002 ExecEvalRow(), ExecEvalFieldStoreForm() > contain a nearly identical copy of the same code. And > make_tuple_from_row() also is similar. It seem that there should be a > heap_form_tuple() version doing this for us? I was worried about having either a performance impact or code duplication. The actual plan where you could insert this organically is in fill_val(), which is called from heap_fill_tuple(), which is called from heap_form_tuple(). If you don't mind passing down 'int flags' or similar to all those, and having additional branches to make the behavior dependent on the flags, I'm cool with it. Or if you think we should template-ize all those functions, that'd be another way to go. But I was afraid I would get complaints about adding overhead to hot code paths. > > I'm open to being convinced that we don't need to do either of these > > things, and that the cost of iterating over all varlenas in the tuple > > is not so bad as to preclude doing things as you have them here. But, > > I'm afraid it's going to be too expensive. > > I mean, I would just define several of those places away by not caring > about tuples in a different compressino formation ending up in a > table... That behavior feels unacceptable to me from a user expectations point of view. I think there's an argument that if I update a tuple that contains a compressed datum, and I don't update that particular column, I think it would be OK to not recompress the column. But, if I insert data into a table, I as a user would expect that the compression settings for that column are going to be respected. Deciding that's optional because we don't have a good way of making it fast seems like a major cop-out, at least to me. I think from a user perspective you don't expect INSERT INTO .. SELECT FROM to create a different final state than a dump and reload, and that if we deviate from that people are gonna be unhappy. I could be wrong; maybe it's only me who would be unhappy. -- Robert Haas EDB: http://www.enterprisedb.com