That makes sense indeed. Do we have any more comments on the language of the proposal [1] or should we proceed to vote?
Rok

[1] https://github.com/apache/arrow/pull/33925/files

On Wed, Feb 22, 2023 at 2:13 PM Antoine Pitrou <anto...@python.org> wrote:

>
> That's a good point.
>
> Regards
>
> Antoine.
>
>
> On 22/02/2023 at 14:11, Dewey Dunnington wrote:
> > I don't think having both dimension names and permutation is
> > redundant...dimension names can also serve as human-readable tags that
> > help a human interpret the values. If reading a NetCDF, for example, one
> > might store the dimension variable names. When determining type equality
> > it may be useful that
> > {..., permutation = [2, 0, 1], dim_names = ["C", "H", "W"]}
> > is not equal to
> > {..., permutation = [2, 0, 1], dim_names = ["x", "y", "z"]}.
> >
> > On Wed, Feb 22, 2023 at 4:56 AM Rok Mihevc <rok.mih...@gmail.com> wrote:
> >
> >>>
> >>>>>
> >>>>> Should we rule that `dim_names` and `permutation` are mutually
> >>>>> exclusive?
> >>>>>
> >>>>
> >>>> Since `dim_names` have to "map to the physical layout (row-major)",
> >>>> that means permutation will always be trivial, which indeed makes it
> >>>> unnecessary to store both.
> >>>
> >>> I don't think it is necessarily needed to explicitly make them
> >>> mutually exclusive. I don't know how useful this would be in practice,
> >>> but you certainly *can* specify both in a meaningful way. Re-using the
> >>> example of NHWC data, which is physically stored as NCHW, you can keep
> >>> track of this by specifying a permutation of [2, 0, 1], but at the
> >>> same time you could also still save the dimension names as
> >>> ["C", "H", "W"].
> >>>
> >>
> >> I'll advocate for the original comment, but I'm ok either way. Having
> >> both `dim_names` and `permutation` is redundant - if the user knows their
> >> desired order of `dim_names` they can derive the permutation. If they
> >> don't use `dim_names` they probably don't want them.
> >>
> >
>
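
To make the two positions in the quoted thread concrete, here is a minimal Python sketch. It is not the Arrow API: `TensorTypeParams` and `derive_permutation` are hypothetical names invented for illustration, the shape values are arbitrary, and the sketch assumes that `permutation[i]` gives the logical index of the i-th physical dimension, which matches the NHWC/NCHW example above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class TensorTypeParams:
    """Hypothetical stand-in for the type parameters under discussion."""
    shape: Tuple[int, ...]
    permutation: Optional[Tuple[int, ...]] = None
    dim_names: Optional[Tuple[str, ...]] = None


def derive_permutation(logical_names: Sequence[str],
                       physical_names: Sequence[str]) -> Tuple[int, ...]:
    """For each physical dimension, look up its position in the logical order.

    Assumes both sequences contain the same unique names.
    """
    return tuple(logical_names.index(name) for name in physical_names)


# Same shape and permutation, different names: the dataclass equality
# compares every field, so these two parameter sets are not equal.
chw = TensorTypeParams((3, 4, 5), permutation=(2, 0, 1), dim_names=("C", "H", "W"))
xyz = TensorTypeParams((3, 4, 5), permutation=(2, 0, 1), dim_names=("x", "y", "z"))
print(chw == xyz)  # False

# If the desired logical order is known, the permutation can be derived
# from the physical order of the names.
print(derive_permutation(("H", "W", "C"), ("C", "H", "W")))  # (2, 0, 1)
```

The first comparison reflects the argument that names carry information beyond the permutation; the second reflects the argument that a user who knows the desired name order can derive the permutation without storing it.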