That makes sense indeed.
Do we have any more comments on the language of the proposal [1] or should
we proceed to vote?

Rok

[1] https://github.com/apache/arrow/pull/33925/files
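
For anyone skimming the thread, here is a minimal sketch of the equality point Dewey raises below. It assumes a hypothetical `TensorTypeParams` class standing in for the extension type's parameters; it is not the Arrow API, and the shape values are arbitrary. It only shows that two parameter sets with the same permutation but different `dim_names` would compare as unequal if both fields take part in type equality.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TensorTypeParams:
    """Hypothetical stand-in for the proposed type parameters (not the Arrow API)."""

    shape: Tuple[int, ...]
    permutation: Optional[Tuple[int, ...]] = None
    dim_names: Optional[Tuple[str, ...]] = None

    def __post_init__(self):
        ndim = len(self.shape)
        # A permutation must mention every axis index exactly once.
        if self.permutation is not None and sorted(self.permutation) != list(range(ndim)):
            raise ValueError("permutation must be a rearrangement of 0..ndim-1")
        # Names, when given, must cover every dimension.
        if self.dim_names is not None and len(self.dim_names) != ndim:
            raise ValueError("dim_names must have one entry per dimension")


# Same shape and permutation, different human-readable names.
chw = TensorTypeParams((3, 4, 5), (2, 0, 1), ("C", "H", "W"))
xyz = TensorTypeParams((3, 4, 5), (2, 0, 1), ("x", "y", "z"))
unnamed = TensorTypeParams((3, 4, 5), (2, 0, 1))

# The generated __eq__ compares all three fields, so the names matter.
names_matter = chw != xyz
```

Whether the real type should compare `dim_names` this way is exactly the open question; the sketch only illustrates one possible answer.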

On Wed, Feb 22, 2023 at 2:13 PM Antoine Pitrou <anto...@python.org> wrote:

>
> That's a good point.
>
> Regards
>
> Antoine.
>
>
> On 22/02/2023 at 14:11, Dewey Dunnington wrote:
> > I don't think having both dimension names and permutation is
> > redundant...dimension names can also serve as human-readable tags that help
> > a human interpret the values. If reading a NetCDF, for example, one might
> > store the dimension variable names. When determining type equality it may
> > be useful that {..., permutation = [2, 0, 1], dim_names = ["C", "H", "W"]}
> > is not equal to {..., permutation = [2, 0, 1], dim_names = ["x", "y", "z"]}.
> >
> > On Wed, Feb 22, 2023 at 4:56 AM Rok Mihevc <rok.mih...@gmail.com> wrote:
> >
> >>>
> >>>>>
> >>>>> Should we rule that `dim_names` and `permutation` are mutually exclusive?
> >>>>>
> >>>>
> >>>> Since `dim_names` have to "map to the physical layout (row-major)" that
> >>>> means permutation will always be trivial which indeed makes it unnecessary
> >>>> to store both.
> >>>
> >>> I don't think it is necessarily needed to explicitly make them
> >>> mutually exclusive. I don't know how useful this would be in practice,
> >>> but you certainly *can* specify both in a meaningful way. Re-using the
> >>> example of NHWC data, which is physically stored as NCHW, you can keep
> >>> track of this by specifying a permutation of [2, 0, 1], but at the
> >>> same time you could also still save the dimension names as ["C", "H",
> >>> "W"].
> >>>
> >>
> >> I'll advocate for the original comment, but I'm ok either way. Having both
> >> `dim_names` and `permutation` is redundant - if the user knows their
> >> desired order of `dim_names` they can derive the permutation. If they don't
> >> use `dim_names` they probably don't want them.
> >>
> >
>
