On 14/06/2023 at 17:08, Weston Pace wrote:

Also, I'm very lukewarm towards the concept of "alternative layouts"
suggested somewhere else in this thread. It does not seem a good choice
to complexify the Arrow format that much.

I think this depends on how many of these alternative
layouts exist.  If there are just a few, then I agree, we should just adopt
them as formal first-class layouts.  If there are many, then I think it
will be too much complexity in Arrow to have all the different choices.
Or, we could say there are many, but the alternatives don't belong in Arrow
at all.  In that case I think it's the same question as the above
paragraph, "do we want Arrow to be used within systems?  Or just between
systems?"

I think it's both really. Even if you're just moving data between systems, these systems have to understand all or most existing layouts, even if only to convert them to their preferred internal layout.

It's not obvious to me that we'd like to make the introduction of new layouts easier. A long-term concern with specification complexity is that implementations tend to be incomplete and have inconsistent feature sets (see Parquet implementations for an example of that).

Therefore, a situation which forces us to be parsimonious in adding new layouts might actually be beneficial.

Regards

Antoine.
