On Tue, Feb 8, 2022 3:18 AM Andres Freund <and...@anarazel.de> wrote:
> 
> On 2022-02-07 08:44:00 +0530, Amit Kapila wrote:
> > Right, and it is getting changed. We are just printing the first 200
> > characters (by using SQL [1]) from the decoded tuple so what is shown
> > in the results is the initial 200 bytes.
> 
> Ah, I knew I must have been missing something.
> 
> 
> > The complete decoded data after the patch is as follows:
> 
> Hm. I think we should change the way the strings are shortened - otherwise we
> don't really verify much in that test. Perhaps we could just replace the long
> repetitive strings with something shorter in the output?
> 
> E.g. using something like regexp_replace(data,
> '(1234567890|9876543210){200}', '\1{200}','g')
> inside the substr().
> 
> Wonder if we should deduplicate the number of different toasted strings in the
> file to something that'd allow us to have a single "redact_toast" function or
> such. There's too many different ones to have a reasonably simple redaction
> function right now. But that's perhaps better done separately.
> 

I tried to make the output shorter using your suggestion, with SQL like the
following. Please see the attached patch, which is based on the v8 patch [1].

SELECT substr(regexp_replace(data, '(1234567890|9876543210){200}',
                             '\1{200}', 'g'), 1, 200)
FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL,
                                 'include-xids', '0', 'skip-empty-xacts', '1');

Note that some strings are still longer than 200 characters even after being
shortened, so they still can't be shown in full.

e.g.
table public.toasted_key: UPDATE: old-key: toasted_key[text]:'1234567890{200}' 
new-tuple: id[integer]:1 toasted_key[text]:unchanged-toast-datum 
toasted_col1[text]:unchanged-toast-datum toasted_col2[te

The entire string is:
table public.toasted_key: UPDATE: old-key: toasted_key[text]:'1234567890{200}' 
new-tuple: id[integer]:1 toasted_key[text]:unchanged-toast-datum 
toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'9876543210{200}'

Maybe it's better to change the substr length to 250 to show the entire string,
or we can do it as a separate HEAD-only improvement where we can also
deduplicate some of the other long strings. Thoughts?
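
As a rough sanity check outside the database, the effect of the redaction and
of the two substr lengths can be sketched in Python. The row below is a
hypothetical reconstruction of the example in this mail (it assumes each
toasted value is the 10-digit pattern repeated exactly 200 times), and Python's
re module only stands in for PostgreSQL's regexp_replace here, so this is an
illustration and not part of the patch.

```python
import re

# Hypothetical reconstruction of one decoded row, based on the example in this
# mail. Assumption: each toasted value is the 10-digit pattern repeated 200 times.
OLD_KEY = "1234567890" * 200
NEW_COL2 = "9876543210" * 200
row = (
    "table public.toasted_key: UPDATE: old-key: "
    f"toasted_key[text]:'{OLD_KEY}' "
    "new-tuple: id[integer]:1 "
    "toasted_key[text]:unchanged-toast-datum "
    "toasted_col1[text]:unchanged-toast-datum "
    f"toasted_col2[text]:'{NEW_COL2}'"
)


def redact(data: str) -> str:
    # Stand-in for:
    #   regexp_replace(data, '(1234567890|9876543210){200}', '\1{200}', 'g')
    # \g<1> is the captured pattern; "{200}" is emitted as literal text.
    return re.sub(r"(1234567890|9876543210){200}", r"\g<1>{200}", data)


def substr(data: str, length: int) -> str:
    # Stand-in for SQL substr(data, 1, length).
    return data[:length]


redacted = redact(row)
print("raw length:     ", len(row))
print("redacted length:", len(redacted))
print("first 200:      ", substr(redacted, 200))
print("fits in 250:    ", substr(redacted, 250) == redacted)
```

Under these assumptions the redacted row comes to 221 characters, so a substr
length of 200 cuts it off at "toasted_col2[te" while 250 shows it in full.
Other rows in the test output may have different lengths, so this only checks
the one example above.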

[1] https://www.postgresql.org/message-id/CAA4eK1L_Z_2LDwMNbGrwoO%2BFc-2Q04YORQSA9UfGUTMQpy2O1Q%40mail.gmail.com

Regards,
Tang

Attachment: improve_toast_test.diff