On Sat, Feb 13, 2021 at 8:14 PM Dilip Kumar <dilipbal...@gmail.com> wrote: > > On Thu, Feb 11, 2021 at 8:17 PM Robert Haas <robertmh...@gmail.com> wrote: > > > > On Thu, Feb 11, 2021 at 7:36 AM Dilip Kumar <dilipbal...@gmail.com> wrote: > > > W.R.T the attached patch, In HeapTupleHeaderGetDatum, we don't even > > > attempt to detoast if there is no external field in the tuple, in POC > > > I have got rid of that check, but I think we might need to do better. > > > Maybe we can add a flag in infomask to detect whether the tuple has > > > any compressed data or not as we have for detecting the external data > > > (HEAP_HASEXTERNAL). > > > > No. This feature isn't close to being important enough to justify > > consuming an infomask bit. > > Okay, > > > I don't really see why we need it anyway. If array construction > > already categorically detoasts, why can't we do the same thing here? > > Would it really cost that much? In what case? Having compressed values > > in a record we're going to store on disk actually seems like a pretty > > dumb idea. We might end up trying to recompress something parts of > > which have already been compressed. > > > > If we refer the comments atop function "toast_flatten_tuple_to_datum" > > --------------- > * We have a general rule that Datums of container types (rows, arrays, > * ranges, etc) must not contain any external TOAST pointers. Without > * this rule, we'd have to look inside each Datum when preparing a tuple > * for storage, which would be expensive and would fail to extend cleanly > * to new sorts of container types. > * > * However, we don't want to say that tuples represented as HeapTuples > * can't contain toasted fields, so instead this routine should be called > * when such a HeapTuple is being converted into a Datum. > * > * While we're at it, we decompress any compressed fields too. This is not > * necessary for correctness, but reflects an expectation that compression > * will be more effective if applied to the whole tuple not individual > * fields. We are not so concerned about that that we want to deconstruct > * and reconstruct tuples just to get rid of compressed fields, however. > * So callers typically won't call this unless they see that the tuple has > * at least one external field. > ---------------- > > It appears that the general rule we want to follow is that while > creating the composite type we want to flatten any external pointer, > but while doing that we also decompress any compressed field with the > assumption that compressing the whole row/array will be a better idea > instead of keeping them compressed individually. However, if there > are no external toast pointers then we don't want to make an effort to > just decompress the compressed data. > > Having said that I don't think this rule is followed throughout the > code for example > > 1. "ExecEvalRow" is calling HeapTupleHeaderGetDatum only if there is > any external field and which is calling "toast_flatten_tuple_to_datum" > so this is following the rule. > 2. "ExecEvalWholeRowVar" is calling "toast_build_flattened_tuple", but > this is just flattening the external toast pointer but not doing > anything to the compressed data. > 3. "ExecEvalArrayExpr" is calling "construct_md_array", which will > detoast the attribute if attlen is -1, so this will decompress any > compressed data even though there is no external toast pointer. > > So in 1 we are following the rule but in 2 and 3 we are not. > > IMHO, for the composite data types we should make common a rule and we > should follow that everywhere. As you said it will be good if we can > always detoast any external/compressed data, that will help in getting > better compression as well as fetching the data will be faster because > we can avoid multi level detoasting/decompression. I will analyse > this further and post a patch for the same.
I have done further analysis of this issue and came up with the attached patch. So with this patch, like external toast posiners we will not allow any compressed data also in the composite types. The problem is that now we will be processing all the tuple while forming the composite type irrespective of the source of the tuple, I mean if user is directly inserting into the array type and not selecting from another table then there will not be any compressed data so now checking each field of tuple for compressed data is unnecessary but I am not sure how to distinguish between those cases. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
From 59a8520a5278c43f712315035ed2261d83e310ca Mon Sep 17 00:00:00 2001 From: Dilip Kumar <dilipkumar@localhost.localdomain> Date: Tue, 16 Feb 2021 19:24:38 +0530 Subject: [PATCH v1] Disallow compressed data inside container types Currently, we have a general rule that Datums of container types (rows, arrays, ranges, etc) must not contain any external TOAST pointers. But the rule for the compressed data is not defined and no specific rule is followed e.g. while constructing the array we decompress the comprassed field but while contructing the row in some cases we don't decompress the compressed data whereas in the other cases we onlle decompress while flattening the external toast pointers. This patch make a general rule for the compressed data i.e. we don't allow the compressed data in the container type. --- src/backend/access/common/heaptuple.c | 28 ++++++---------------------- src/backend/access/heap/heaptoast.c | 16 +++++++++------- src/backend/executor/execTuples.c | 11 ----------- src/include/access/heaptoast.h | 4 ++-- 4 files changed, 17 insertions(+), 42 deletions(-) diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c index 24a27e3..5c3194f 100644 --- a/src/backend/access/common/heaptuple.c +++ b/src/backend/access/common/heaptuple.c @@ -983,30 +983,14 @@ heap_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleDesc) Datum heap_copy_tuple_as_datum(HeapTuple tuple, TupleDesc tupleDesc) { - HeapTupleHeader td; - - /* - * If the tuple contains any external TOAST pointers, we have to inline - * those fields to meet the conventions for composite-type Datums. - */ - if (HeapTupleHasExternal(tuple)) - return toast_flatten_tuple_to_datum(tuple->t_data, - tuple->t_len, - tupleDesc); - /* - * Fast path for easy case: just make a palloc'd copy and insert the - * correct composite-Datum header fields (since those may not be set if - * the given tuple came from disk, rather than from heap_form_tuple). + * The tuple contains compressed/external TOAST pointers, so we have + * to inline those fields to meet the conventions for composite-type + * Datums. */ - td = (HeapTupleHeader) palloc(tuple->t_len); - memcpy((char *) td, (char *) tuple->t_data, tuple->t_len); - - HeapTupleHeaderSetDatumLength(td, tuple->t_len); - HeapTupleHeaderSetTypeId(td, tupleDesc->tdtypeid); - HeapTupleHeaderSetTypMod(td, tupleDesc->tdtypmod); - - return PointerGetDatum(td); + return toast_flatten_tuple_to_datum(tuple->t_data, + tuple->t_len, + tupleDesc); } /* diff --git a/src/backend/access/heap/heaptoast.c b/src/backend/access/heap/heaptoast.c index 55bbe1d..dd162da 100644 --- a/src/backend/access/heap/heaptoast.c +++ b/src/backend/access/heap/heaptoast.c @@ -549,14 +549,15 @@ toast_flatten_tuple_to_datum(HeapTupleHeader tup, /* ---------- * toast_build_flattened_tuple - * - * Build a tuple containing no out-of-line toasted fields. - * (This does not eliminate compressed or short-header datums.) + * Build a tuple containing no compressed/out-of-line toasted fields. + * (This does not eliminate short-header datums.) * * This is essentially just like heap_form_tuple, except that it will - * expand any external-data pointers beforehand. + * expand any compressed/external-data pointers beforehand. * - * It's not very clear whether it would be preferable to decompress - * in-line compressed datums while at it. For now, we don't. + * It is not necessary to decompress the compressed data for the + * correctness, but reflects an expectation that compression will be more + * effective if applied to the whole tuple not individual fields. * ---------- */ HeapTuple @@ -589,9 +590,10 @@ toast_build_flattened_tuple(TupleDesc tupleDesc, struct varlena *new_value; new_value = (struct varlena *) DatumGetPointer(new_values[i]); - if (VARATT_IS_EXTERNAL(new_value)) + if (VARATT_IS_EXTERNAL(new_value) || + VARATT_IS_COMPRESSED(new_value)) { - new_value = detoast_external_attr(new_value); + new_value = detoast_attr(new_value); new_values[i] = PointerGetDatum(new_value); freeable_values[num_to_free++] = (Pointer) new_value; } diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c index 73c35df..ca7fbed 100644 --- a/src/backend/executor/execTuples.c +++ b/src/backend/executor/execTuples.c @@ -2189,13 +2189,6 @@ BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values) * memory context. Beware of code that changes context between the initial * heap_form_tuple/etc call and calling HeapTuple(Header)GetDatum. * - * For performance-critical callers, it could be worthwhile to take extra - * steps to ensure that there aren't TOAST pointers in the output of - * heap_form_tuple to begin with. It's likely however that the costs of the - * typcache lookup and tuple disassembly/reassembly are swamped by TOAST - * dereference costs, so that the benefits of such extra effort would be - * minimal. - * * XXX it would likely be better to create wrapper functions that produce * a composite Datum from the field values in one step. However, there's * enough code using the existing APIs that we couldn't get rid of this @@ -2207,10 +2200,6 @@ HeapTupleHeaderGetDatum(HeapTupleHeader tuple) Datum result; TupleDesc tupDesc; - /* No work if there are no external TOAST pointers in the tuple */ - if (!HeapTupleHeaderHasExternal(tuple)) - return PointerGetDatum(tuple); - /* Use the type data saved by heap_form_tuple to look up the rowtype */ tupDesc = lookup_rowtype_tupdesc(HeapTupleHeaderGetTypeId(tuple), HeapTupleHeaderGetTypMod(tuple)); diff --git a/src/include/access/heaptoast.h b/src/include/access/heaptoast.h index 8b29f1a..c80af45 100644 --- a/src/include/access/heaptoast.h +++ b/src/include/access/heaptoast.h @@ -128,8 +128,8 @@ extern Datum toast_flatten_tuple_to_datum(HeapTupleHeader tup, /* ---------- * toast_build_flattened_tuple - * - * Build a tuple containing no out-of-line toasted fields. - * (This does not eliminate compressed or short-header datums.) + * Build a tuple containing no compressed/out-of-line toasted fields. + * (This does not eliminate short-header datums.) * ---------- */ extern HeapTuple toast_build_flattened_tuple(TupleDesc tupleDesc, -- 1.8.3.1