Hi all, I have been looking at and testing the patch set for CREATE TABLE COMPRESSION, and spotted a couple of things in parallel of some work done by Jacob (added in CC).
The behavior around CREATE TABLE AS and matviews is a bit confusing, and not documented. First, at the grammar level, it is not possible to specify which compression option is used per column when creating the relation. So, all the relation columns would just set a column's compression to be default_toast_compression for all the toastable columns of the relation. That's not enforceable at column level when the relation is created, except with a follow-up ALTER TABLE. That's similar to STORAGE when it comes to matviews, but these are at least documented. And so, ALTER MATERIALIZED VIEW supports SET COMPRESSION but this is not mentioned in its docs: https://www.postgresql.org/docs/devel/sql-altermaterializedview.html psql could have tab completion support for that. There are no tests in pg_dump to make sure that some ALTER MATERIALIZED VIEW or ALTER TABLE commands are generated when the compression of a matview's or table's column is changed. This depends on the value of default_toast_compression, but that would be nice to have something, and get at least some coverage with --no-toast-compression. You would need to make the tests conditional here, for example with check_pg_config() (see for example what's done with channel binding in ssl/t/002_scram.pl). Another thing is the handling of the per-value compression that could be confusing to the user. As no materialization of the data is done for a CTAS or a matview, and the regression tests of compression.sql track that AFAIK, there can be a mix of toast values compressed with lz4 or pglz, with pg_attribute.attcompression being one or the other. Now, we don't really document any of that, and the per-column compression value would be set to default_toast_compression while the stored values may use a mix of the compression methods, depending on where the toasted values come from. If this behavior is intended, this makes me wonder in what the possibility to set the compression for a materialized view column is useful for except for a logical dump/restore? As of HEAD we'd just insert the toasted value from the origin as-is so the compression of the column has no effect at all. Another thing here is the inconsistency that this brings with pg_dump. For example, as the dumped values are decompressed, we could have values compressed with pglz at the origin, with a column using lz4 within its definition that would make everything compressed with lz4 once the values are restored. This choice may be fine, but I think that it would be good to document all that. That would be less surprising to the user. Similarly, we may want to document that COMPRESSION does not trigger a table rewrite, but that it is effective only for the new toast values inserted if a tuple is rebuilt and rewritten? Would it be better to document that pg_column_compression() returns NULL if the column is not a toastable type or if the column's value is not compressed? The flexibility with allow_system_table_mods allows one to change the compression method of catalogs, for example switching rolpassword with a SCRAM verifier large enough to be toasted would lock an access to the cluster if restarting the server without lz4 support. I shouldn't have done that but I did, and I like it :) The design used by this feature is pretty cool, as long as you don't read the compressed values, physical replication can work out of the box even across nodes that are built with or without lz4. Thanks, -- Michael
signature.asc
Description: PGP signature