On 03/02/2026 at 20:27, Rusty Conover wrote:
Hi Antoine,

It is nice to hear from you!

Ditto :-)

On the face of it, this looks like a reasonable idea, though I wonder if
it should be a separate message type *or* an optional field carried
together in RecordBatches.

The main issue with carrying this in RecordBatch metadata is ordering. While 
IPC already supports `custom_metadata` via `write_batch` (which I’ve been 
using), that approach assumes the application data can be attached to a 
specific batch.

In some cases, the application data and record batches are produced 
independently and cannot be cleanly associated. A concrete example is 
interleaving stderr output (arbitrary log messages) with record batches written 
to stdout, while preserving a single ordered IPC stream.

I experimented with using zero-row record batches as a workaround, but this is 
inefficient: even with no rows, the serialized message size grows with schema 
complexity.

Ok, perhaps we can find a generic solution using two additions:

1) a new Empty message type to avoid the overhead (and semantics) of empty record batches
2) a new application_data field in the Message table to pass arbitrary opaque data with any kind of message

Something like:
https://gist.github.com/pitrou/363c4509706f56743f0ca0373f20949c
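
To make the idea concrete, here is a toy model in plain Python of how the two additions could combine for the stderr/stdout case. This is only a sketch: it is not the content of the gist, and the names `HeaderType`, `Message` and `interleave` are hypothetical, not part of any Arrow API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Iterator, Optional, Tuple

class HeaderType(Enum):
    SCHEMA = "Schema"
    RECORD_BATCH = "RecordBatch"
    EMPTY = "Empty"  # proposed: no body, nothing that depends on the schema

@dataclass
class Message:
    header_type: HeaderType
    body: bytes = b""
    application_data: Optional[bytes] = None  # proposed: opaque to Arrow itself

def interleave(events: Iterable[Tuple[str, bytes]]) -> Iterator[Message]:
    """Turn ("batch", payload) / ("log", payload) events into one ordered stream."""
    for kind, payload in events:
        if kind == "batch":
            yield Message(HeaderType.RECORD_BATCH, body=payload)
        else:
            # A log line travels alone, in order, without a dummy record batch.
            yield Message(HeaderType.EMPTY, application_data=payload)

stream = list(interleave([("batch", b"b0"), ("log", b"starting"), ("batch", b"b1")]))
```

In this model a reader that does not care about application data can simply skip Empty messages, while the relative order of log lines and batches is preserved.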

What do you think?

Regards

Antoine.
