On Tue, Jun 16, 2026 at 2:36 PM Ashutosh Bapat <[email protected]> wrote:
>
> On Mon, Jun 15, 2026 at 7:23 PM Amit Kapila <[email protected]> wrote:
> >
> > On Mon, Jun 15, 2026 at 2:34 PM Kyotaro Horiguchi
> > <[email protected]> wrote:
> > >
> > > At Mon, 15 Jun 2026 13:52:36 +0530, Ashutosh Sharma
> > > <[email protected]> wrote in
> > > > Sorry for chiming in - I may well be misunderstanding this, but here's
> > > > how I'm currently thinking about it:
> > > >
> > > > Total transaction bytes refers to the size of decoded transactional
> > > > data accumulated in the reorder buffer for a given transaction.
> > > >
> > > > Sent bytes (as I understand from the patch) refers to the size of the
> > > > downstream output that the output plugin produces from that decoded
> > > > data, after any filtering and format conversion.
> > > >
> > > > To illustrate: if a transaction's decoded changes occupy 550 bytes in
> > > > the reorder buffer, but the output plugin filters some out and emits
> > > > only 300 bytes downstream, then total transaction bytes = 550 and sent
> > > > bytes = 300. Conversely, if all 550 bytes are converted into a more
> > > > verbose format and emitted as 700 bytes, total transaction bytes
> > > > remains 550 while sent bytes becomes 700.
> > > >
> > > > If I'm reading this right, since total bytes for a transaction is the
> > > > baseline from which transaction-derived downstream output is produced,
> > > > I wonder whether sent bytes should include only that
> > > > transaction-derived downstream output, or also downstream protocol
> > > > traffic such as keepalive messages, which are sent downstream but are
> > > > not derived from transaction bytes in the reorder buffer.
> > > >
> > > > My instinct is that if sent bytes are meant to measure
> > > > transaction-output throughput, keepalive messages probably shouldn't
> > > > be included, since they have no basis in transaction data and might
> > > > distort any comparison with total bytes. But I could be wrong - happy
> > > > to be corrected!
> > >
> > > Thank you for the explanation.
> > >
> > > I think I understand the distinction you are making. However, my
> > > question is one step earlier than the keepalive-message question. I am
> > > wondering whether the new metric needs to be defined in terms of
> > > logical-change output in the first place.
> > >
> > > If I understand the use case correctly, I think the discussion here is
> > > primarily about relatively high-volume logical replication
> > > workloads. My point is that, in that situation, I would expect the
> > > amount of logical-change output and the amount of data actually sent
> > > over the replication connection to show broadly similar trends.
> > >
> > > The latter seems easier to interpret, while still providing a useful
> > > signal for monitoring and capacity-planning purposes. It also seems
> > > more intuitive, since it corresponds directly to the amount of data
> > > sent over the replication connection.
> > >
> >
> > BTW, the patch internally counts other protocol messages like START
> > STREAM/STOP STREAM, BEGIN/END, and quite a few others that help apply
> > workers to understand transaction boundaries and messages. So, I feel
> > in that sense we are already counting protocol bytes as part of patch,
> > so why leave the additional messages that are being discussed here.
>
> Those are logical message types which are part of the logical change
> data - without those messages it's not possible to process the logical
> change data. So they are included. But the keepalive messages, for
> example, aren't part of the logical change data.
>
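
To make the distinction above concrete, here is a toy sketch in Python. It is
not PostgreSQL code, and the class, method, and message-kind names are all
made up for illustration. It assumes the accounting described above: messages
that carry or delimit logical change data (BEGIN/END, stream start/stop, the
changes themselves) count toward sent bytes, while keepalives do not.

```python
from dataclasses import dataclass

# Hypothetical message kinds, for illustration only (not PostgreSQL identifiers).
LOGICAL_KINDS = {"begin", "change", "commit", "stream_start", "stream_stop"}


@dataclass
class Counters:
    total_bytes: int = 0  # decoded size accumulated in the reorder buffer
    sent_bytes: int = 0   # plugin output that is part of the logical change data

    def on_decoded(self, size: int) -> None:
        """Account for decoded transactional data added to the reorder buffer."""
        self.total_bytes += size

    def on_sent(self, kind: str, size: int) -> None:
        """Account for a downstream message; non-logical traffic is skipped."""
        if kind in LOGICAL_KINDS:
            self.sent_bytes += size


c = Counters()
c.on_decoded(550)            # decoded changes for one transaction
c.on_sent("begin", 20)       # transaction boundary, needed by the apply worker
c.on_sent("change", 260)     # output left after filtering
c.on_sent("commit", 20)
c.on_sent("keepalive", 18)   # sent on the wire, but not logical change data
```

With these made-up sizes the sketch ends with total bytes = 550 and sent
bytes = 300, as in the filtering example quoted above; counting the keepalive
as well would give 318.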
I think it would be worth capturing this distinction clearly in the user
documentation. Thanks Amit for raising this point.

--
With Regards,
Ashutosh Sharma.
