Hi,

On 2019-04-09 11:17:29 +0200, Dmitry Dolgov wrote:
> I'm also curious about that. As far as I can see the main objection against
> that was that in this case the recovery process will depend on an extension,
> which could violate reliability.

I don't think that's a primary concern - although it is one.

The mapping from types of records to the handler function needs to be
accessible at a very early stage, when the cluster isn't yet in a
consistent state. So we can't just go and look into pg_am, and look up a
handler function, etc - crash recovery happens much earlier than that is
possible.

Nor do we want the mapping of 'rmgr id' -> 'extension' to be defined in
the config file, that's way too likely to be wrong. So there needs to be
a different type of mapping, accessible outside the catalog. I suspect
we'd have to end up with something very roughly like the relmapper
infrastructure.

A tertiary problem is then how to identify extensions in that mapping -
although I suspect just using any library name that can be passed to
load_library() will be OK.

> But I wonder if this argument is still valid for AM's, since the whole
> data is kind of depends on it, not only the recovery.

I don't buy that argument. If you have an AM that registers, using a new
facility, replay routines, and then it errors out / crashes during
those, there's no way to get the cluster back into a consistent state.
So it's not just the one table in that AM that's gone, it's the entire
cluster that's impacted.

> Btw, can someone elaborate, why exactly generic_xlog is not efficient enough?
> I've went through the corresponding thread, looks like generic WAL records are
> bigger than normal one - is it the only reason?

That's one big reason. But also, you just can't do much more than "write
this block into that file" during recovery with it. A lot of our replay
routines intentionally do more complicated tasks.

Greetings,

Andres Freund
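
To make the "mapping accessible outside the catalog" idea a bit more
concrete, here is a very rough sketch of what a small, checksummed,
fixed-size map from rmgr id to library name might look like. This is
not PostgreSQL code and not a proposed design: every name in it
(ExtRmgrMapFile, the ext_rmgr_map_* functions, the 128..255 id range,
the 64-byte name limit) is made up for illustration, and the FNV-1a
hash merely stands in for whatever CRC a real implementation would use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* All constants below are hypothetical, chosen only for the sketch. */
#define EXT_RMGR_MIN_ID   128
#define EXT_RMGR_MAX_ID   255
#define EXT_RMGR_SLOTS    (EXT_RMGR_MAX_ID - EXT_RMGR_MIN_ID + 1)
#define EXT_RMGR_NAMELEN  64
#define EXT_RMGR_MAGIC    0x524D4752u

/* Fixed-size file image: readable without any catalog access. */
typedef struct ExtRmgrMapFile
{
    uint32_t magic;
    uint32_t checksum;          /* covers names[] only */
    char     names[EXT_RMGR_SLOTS][EXT_RMGR_NAMELEN];
} ExtRmgrMapFile;

/* FNV-1a over the name table; a stand-in for a real CRC. */
static uint32_t
map_checksum(const ExtRmgrMapFile *map)
{
    const unsigned char *p = (const unsigned char *) map->names;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < sizeof(map->names); i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

void
ext_rmgr_map_init(ExtRmgrMapFile *map)
{
    memset(map, 0, sizeof(*map));
    map->magic = EXT_RMGR_MAGIC;
    map->checksum = map_checksum(map);
}

/* Nonzero if the magic number and checksum both match. */
int
ext_rmgr_map_valid(const ExtRmgrMapFile *map)
{
    return map->magic == EXT_RMGR_MAGIC &&
           map->checksum == map_checksum(map);
}

/* Record which library handles rmid; returns 0 on success, -1 on bad input. */
int
ext_rmgr_map_set(ExtRmgrMapFile *map, int rmid, const char *libname)
{
    if (rmid < EXT_RMGR_MIN_ID || rmid > EXT_RMGR_MAX_ID)
        return -1;
    if (libname[0] == '\0' || strlen(libname) >= EXT_RMGR_NAMELEN)
        return -1;

    char *slot = map->names[rmid - EXT_RMGR_MIN_ID];
    memset(slot, 0, EXT_RMGR_NAMELEN);
    strcpy(slot, libname);
    map->checksum = map_checksum(map);
    return 0;
}

/* Returns the library name for rmid, or NULL if none is registered. */
const char *
ext_rmgr_map_lookup(const ExtRmgrMapFile *map, int rmid)
{
    assert(ext_rmgr_map_valid(map));
    if (rmid < EXT_RMGR_MIN_ID || rmid > EXT_RMGR_MAX_ID)
        return NULL;

    const char *slot = map->names[rmid - EXT_RMGR_MIN_ID];
    return slot[0] != '\0' ? slot : NULL;
}

/* Write the whole image to path; returns 0 on success, -1 on failure. */
int
ext_rmgr_map_save(const ExtRmgrMapFile *map, const char *path)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t n = fwrite(map, sizeof(*map), 1, f);
    if (fclose(f) != 0 || n != 1)
        return -1;
    return 0;
}

/* Read the image from path and reject it if it fails validation. */
int
ext_rmgr_map_load(ExtRmgrMapFile *map, const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(map, sizeof(*map), 1, f);
    fclose(f);
    return (n == 1 && ext_rmgr_map_valid(map)) ? 0 : -1;
}
```

In such a scheme, startup would presumably load and validate the file
before replaying any WAL, and hand each non-empty name to
load_library(). The sketch says nothing about how the file would be
updated atomically or kept in sync with the installed extensions,
which is where most of the real difficulty would lie.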