Hey Nate,

Good catch. I would say it was intentional to use an IPC message (while the 
length is redundant, it also contains the metadata version and custom 
metadata), and the comment is a little vague. I'm not sure if we need a vote to 
update this since it is changing files in the format dir.

I did check the Java implementation quickly and even in the initial version, 
the schema is IPC-encapsulated[1].

-David

[1]: 
https://github.com/apache/arrow/commit/b3cd616f6ce39e9a820b658ba2f2e2fd3b153555#diff-80e8aef2d6b475949641689fe8d686db2cc1d6703cafe8f6f0dc120b04f964b7R106

On Fri, Jul 23, 2021, at 20:28, Nate Bauernfeind wrote:
> In flight.proto [1] it states that the encoded bytes are as described in
> the flatbuffer schema.
> 
> ```
> /*
> * Wrap the result of a getSchema call
> */
> message SchemaResult {
>   // schema of the dataset as described in Schema.fbs::Schema.
>   bytes schema = 1;
> }
> ```
> 
> However, both this schema and the schema on the flight info are actually
> encoded like an IPC file stream message.
> 
> They start with a 4-byte IPC_CONTINUATION_TOKEN, followed by a 4-byte
> message size, followed by a Message wrapping a Schema. See [2] and [3] for
> evidence in the Java client.
> 
> Is this an accidental bug that has propagated to all client implementations?
> Or was it intentional and I should submit a PR to update the comments?
> 
> Thanks,
> Nate
> [1] https://github.com/apache/arrow/blob/master/format/Flight.proto#L195
> [2]
> https://github.com/apache/arrow/blob/master/java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightInfo.java#L141
> [3]
> https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/ipc/message/MessageSerializer.java#L161
> --
> 

Reply via email to