On Wed, Aug 16, 2023 at 12:01:01PM +0200, Tomas Vondra wrote:
> To me losing messages seems like a bad thing, but if the users are aware
> of it and are fine with it ... I'm simply arguing that if we conclude
> this is a durability bug, we should not leave it unfixed because it
> might have performance impact.

I've been doing some digging here, and the original bdr repo posted at
[1] has a concept similar to LogLogicalMessage() called
LogStandbyMessage().  *All* the non-transactional code paths enforce an
XLogFlush() after *each* message logged.  So the original expectation
seems pretty clear to me: flushes were wanted.

[1]: https://github.com/2ndQuadrant/bdr

>> I've e.g. used non-transactional messages for:
>>
>> - A non-transactional queuing system. Where sometimes one would dump a
>>   portion of tables into messages, with something like
>>   SELECT pg_logical_emit_message(false, 'app:<task>', to_json(r)) FROM r;
>>   Obviously flushing after every row would be bad.
>>
>>   This is useful when you need to coordinate with other systems in a
>>   non-transactional way. E.g. letting other parts of the system know that
>>   files on disk (or in S3 or ...) were created/deleted, since a database
>>   rollback wouldn't unlink/revive the files.
>>
>> - Audit logging, when you want to log in a way that isn't undone by rolling
>>   back transaction - just flushing every pg_logical_emit_message() would
>>   increase the WAL flush rate many times, because instead of once per
>>   transaction, you'd now flush once per modified row. It'd basically make it
>>   impractical to use for such things.
>>
>> - Optimistic locking. Emitting things that need to be locked on logical
>>   replicas, to be able to commit on the primary. A pre-commit hook would wait
>>   for the WAL to be replayed sufficiently - but only once per transaction,
>>   not once per object.
>
> How come the possibility of losing messages is not an issue for these
> use cases? I mean, surely auditors would not like that, and I guess
> forgetting locks might be bad too.

+1.  Now I can also get why one may not want to flush every individual
message if all one cares about is the queue being flushed after
generating a series of them, so after sleeping on it I'm OK with the
last patch I posted, where one can just choose what one wants.

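To make the trade-off concrete, here is a rough toy model of the three behaviors discussed above: flush after each message, flush once after a batch, and no explicit flush at all. This is only a sketch in Python, not PostgreSQL code; `ToyWal` and `run` are made-up names, and a "crash" is assumed to drop whatever has not been flushed.

```python
class ToyWal:
    """Toy write-ahead log: records stay in memory until flush() is called."""

    def __init__(self):
        self.buffered = []    # emitted but not yet flushed
        self.durable = []     # would survive a crash
        self.flush_count = 0  # number of flushes that wrote something

    def emit(self, message, flush=False):
        # Hypothetical stand-in for emitting a non-transactional message.
        self.buffered.append(message)
        if flush:
            self.flush()

    def flush(self):
        # Only count flushes that actually move records to durable storage.
        if self.buffered:
            self.durable.extend(self.buffered)
            self.buffered.clear()
            self.flush_count += 1

    def crash(self):
        """Drop everything not yet flushed and return the lost messages."""
        lost = list(self.buffered)
        self.buffered.clear()
        return lost


def run(rows, flush_each, flush_at_end):
    """Emit all rows, then crash; return (flushes, durable, lost) counts."""
    wal = ToyWal()
    for row in rows:
        wal.emit(row, flush=flush_each)
    if flush_at_end:
        wal.flush()
    lost = wal.crash()
    return wal.flush_count, len(wal.durable), len(lost)


rows = [f"row-{i}" for i in range(1000)]
per_message = run(rows, flush_each=True, flush_at_end=False)
batched = run(rows, flush_each=False, flush_at_end=True)
unflushed = run(rows, flush_each=False, flush_at_end=False)
```

With 1000 rows, flushing each message costs 1000 flushes, a single flush after the batch keeps all 1000 messages for the cost of one flush, and skipping the flush loses all of them in this model. In a real server an unflushed record would usually reach disk soon anyway (a later commit or the WAL writer flushes it), so the window for loss is much narrower than the toy suggests.
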
The default, though, may be better with flush set to true rather than
false (the patch has kept flush at false).

I am not sure where this is leading yet, so I have registered a CF entry
to keep track of that:
https://commitfest.postgresql.org/44/4505/
--
Michael