On Mon, Mar 8, 2021 at 5:02 AM Dilip Kumar <dilipbal...@gmail.com> wrote: > So now only pending point is, how do we handle the upgrade when you > are upgrading from --with-lz4 to --without-lz4 binary and a couple of > options discussed here are > a) Should we allow table creation with lz4 even if it is compiled > --without-lz4? In case of xml we always allow table creation even if > it is compiled --wthout-libxml > b) Instead of allowing this always, only allow during binary upgrade.
I think the basic answer to (a) that it doesn't matter. Suppose the user is not upgrading but just feels like creating a table that is configured to use LZ4 compression. Does it really matter whether they get the error when they create the table or when they load the data? Personally, I think it is slightly more user-friendly to give the error when they try to create the table, because the problem doesn't occur when inserting ANY row, but only when inserting rows that are wide enough that compression will occur. It's not that great to have people create a table and then find out only much later that it doesn't work. On the other hand, consistency with the way the xml data type already works seems like a fair enough argument for letting the error happen when we try to actually use the compression method. So I can't get worked up about it either way. Regarding (b), it seems to me that with this approach, we have to document that pg_upgrade from binaries that support LZ4 to binaries that don't support LZ4 is fundamentally unsafe. You might have LZ4-compressed values in your columns even if they are now set to use PGLZ, and you might have LZ4'd data inside composite values that are on disk someplace. We have no idea whether those things are true or not, and we can't prevent you from upgrading to something that makes part of your data inaccessible. Given that, if we go with this approach, I think we should expend exactly 0 code trying to making pg_upgrade pass in cases where there are LZ4 columns in the database and the new binaries don't support LZ4. Just because the user goes and gets rid of all the LZ4 columns before upgrading doesn't mean that the upgrade is safe, but if they haven't even done that much, maybe they should reconsider things a bit. Some other review comments: toast_get_compression_method() should now return char, not Oid. With this design, we can support changing the compression method on a column quite easily. It's just a hint, like the STORAGE parameter. It has no bearing on what can be present in the table, but just controls how new values are stored. It would be nice to have a way to force anything compressed with the old method to be re-compressed with the new method, but not having that doesn't preclude allowing the parameter to be changed. I am tempted to propose that we collapse compress_lz4.c and compress_pglz.c into a single file, get rid of the directory, and just have something like src/backend/access/common/toast_compression.c. The files are awfully short, and making a whole new directory for that small amount of code seems like overkill. I think the pg_dump argument should be --no-toast-compression, not --no-toast-compressions. I agree with Justin that pg_restore should have the option also. Man, it would be really nice to be able to set the default for new tables, rather than having all these places hard-coded to use DefaultCompressionMethod. Surely lotsa people are going to want to set toast_compression = lz4 in postgresql.conf and forget about it. Is there any reason not to change varattrib_4b's description of va_tcinfo that says "and flags" to instead say "and compression method"? And rename VARFLAGS_4B_C to VARCOMPRESS_4B_C? I don't know why we should call it flags when we know it's specifically compression information. You should probably have a test that involves altering the type of a varlena column to non-varlena, and the other way around, and make sure that changing integer -> text sets attcompression and doing the reverse clears it. You need to update catalogs.sgml. On the whole I don't see a whole lot to complain about here. I don't love giving up on the idea of tracking which compression methods are used where, but making that work without performance regressions seems very difficult and perhaps just outright impossible, and dealing with all the concurrency problems that introduces is a pain, too. I think accepting a feature that gives us LZ4 compression is better than rejecting it because we can't solve those problems. -- Robert Haas EDB: http://www.enterprisedb.com