Similarly to Micah, I think of "Arrow IPC" as a format that is optimized for "IPC", which I have assumed means it minimizes CPU overhead when using data read from storage because it's already in a memory-friendly format (e.g. minimal deserialization).
Not sure the "IPC" is necessary, but it does push the intent into the name (unless it's actually a misnomer).

Aldrin Montana
Computer Science PhD Student
UC Santa Cruz

On Tue, Aug 30, 2022 at 8:29 PM Micah Kornfield <emkornfi...@gmail.com> wrote:

> I think one source of ambiguity for Arrow files, at least for me, is
> whether they are just a string of messages concatenated or they are the
> files that contain the metadata footer.
>
> On Tue, Aug 30, 2022 at 5:11 AM Dewey Dunnington
> <de...@voltrondata.com.invalid> wrote:
>
> > Ian has a very good point...I would be in favour of calling them "Arrow
> > files" wherever possible since there's no need to know or care what
> > interprocess communication is to use them!
> >
> > On Mon, Aug 29, 2022 at 6:50 PM Ian Cook <i...@ursacomputing.com> wrote:
> >
> > > +1 We should explicitly discourage further use of “Feather” to refer to
> > > Arrow IPC files.
> > >
> > > In this spirit of simplifying terminology: Does the “IPC” in the term
> > > “Arrow IPC files” serve a truly necessary purpose? Is there another type of
> > > “Arrow file” that the “IPC” serves to disambiguate? If not, can we simply
> > > refer to these files as “Arrow files” in most places in the documentation
> > > and website? (In a few important places we should clarify that when we say
> > > “Arrow file” we are referring to a file that uses the Arrow IPC file
> > > format.)
> > >
> > > Ian
> > >
> > > On Mon, Aug 29, 2022 at 17:33 Sutou Kouhei <k...@clear-code.com> wrote:
> > >
> > > > +1 for 1.
> > > >
> > > > Thanks,
> > > > --
> > > > kou
> > > >
> > > > In <CAOYPqDCAib2wBKaKnRij9=__OsUJJghVq1UUTNibK2T0Np+=r...@mail.gmail.com>
> > > > "Re: Usage of the name Feather?" on Mon, 29 Aug 2022 20:18:37 +0200,
> > > > Jorge Cardoso Leitão <jorgecarlei...@gmail.com> wrote:
> > > >
> > > > > I agree.
> > > > >
> > > > > I suspect that the most widely used API with "feather" is Pandas'
> > > > > read_feather.
> > > > >
> > > > > On Mon, 29 Aug 2022, 19:55 Weston Pace, <weston.p...@gmail.com> wrote:
> > > > >
> > > > >> I agree as well. I think most lingering uses of the term "feather"
> > > > >> are in pyarrow and R however, so it might be good to hear from some of
> > > > >> those maintainers.
> > > > >>
> > > > >> On Mon, Aug 29, 2022 at 9:35 AM Antoine Pitrou <anto...@python.org> wrote:
> > > > >> >
> > > > >> > I agree with this as well.
> > > > >> >
> > > > >> > Regards
> > > > >> >
> > > > >> > Antoine.
> > > > >> >
> > > > >> > On Mon, 29 Aug 2022 11:29:45 -0400
> > > > >> > Andrew Lamb <al...@influxdata.com> wrote:
> > > > >> > > In the rust implementation we use the term "Arrow IPC" and I support your
> > > > >> > > option 1:
> > > > >> > >
> > > > >> > > > The name Feather V2 is deprecated. Only the extension ".arrow" will be
> > > > >> > > > used for IPC files.
> > > > >> > >
> > > > >> > > Andrew
> > > > >> > >
> > > > >> > > On Mon, Aug 29, 2022 at 11:21 AM Matthew Topol <m...@voltrondata.com.invalid>
> > > > >> > > wrote:
> > > > >> > >
> > > > >> > > > When I wrote "In-Memory Analytics with Apache Arrow" I definitely
> > > > >> > > > treated "Feather" as deprecated and mentioned it only in passing
> > > > >> > > > specifically indicating "Arrow IPC" as the terminology to use. I only
> > > > >> > > > even mentioned "Feather" at all because there are still methods in
> > > > >> > > > pyarrow that reference it by name.
> > > > >> > > >
> > > > >> > > > That's just my opinion though...
> > > > >> > > >
> > > > >> > > > On Mon, Aug 29 2022 at 11:08:53 AM -0400, David Li
> > > > >> > > > <lidav...@apache.org> wrote:
> > > > >> > > > > This has come up before, e.g. see [1] [2] [3].
> > > > >> > > > >
> > > > >> > > > > I would say "Feather" is effectively deprecated and we are using
> > > > >> > > > > "Arrow IPC" now but I am not sure what others think. (From that
> > > > >> > > > > GitHub link, it seems to be mixed.) And ".arrow" is the official
> > > > >> > > > > extension now (since it is registered as part of our MIME type). But
> > > > >> > > > > there's existing documentation and not everything has been updated to
> > > > >> > > > > be consistent (as you saw).
> > > > >> > > > >
> > > > >> > > > > [1]: <https://lists.apache.org/thread/0s6lgvd3g56ymd60vl5lgzhf4ro6hts5>
> > > > >> > > > > [2]: <https://arrow.apache.org/faq/#what-about-the-feather-file-format>
> > > > >> > > > > [3]: <https://stackoverflow.com/questions/67910612/arrow-ipc-vs-feather/67911190#67911190>
> > > > >> > > > >
> > > > >> > > > > -David
> > > > >> > > > >
> > > > >> > > > > On Mon, Aug 29, 2022, at 10:50, 島 達也 wrote:
> > > > >> > > > >> Hi all.
> > > > >> > > > >>
> > > > >> > > > >> I know the documentation (mainly pyarrow documentation) sometimes refers
> > > > >> > > > >> to IPC files as Feather files, but are there any guidelines for when to
> > > > >> > > > >> refer to an IPC file as a Feather file and when to refer to it as an IPC
> > > > >> > > > >> file?
> > > > >> > > > >> I believe that calling the same file an Arrow IPC file at times and a
> > > > >> > > > >> Feather file at other times is confusing to those unfamiliar with Apache
> > > > >> > > > >> Arrow (myself included).
> > > > >> > > > >> Surprisingly, these files may even have completely different extensions,
> > > > >> > > > >> ".arrow" and ".feather", which are not similar.
> > > > >> > > > >>
> > > > >> > > > >> Perhaps there are several options for future use of the name Feather,
> > > > >> > > > >> such as
> > > > >> > > > >>
> > > > >> > > > >> 1. The name Feather V2 is deprecated. Only the extension ".arrow" will
> > > > >> > > > >>    be used for IPC files.
> > > > >> > > > >> 2. In some contexts(?), IPC files are referred to as Feather; only
> > > > >> > > > >>    ".arrow" is used for the IPC file extension to clearly distinguish
> > > > >> > > > >>    it from Feather V1's ".feather".
> > > > >> > > > >> 3. When an IPC file is called Feather by some rule, extension
> > > > >> > > > >>    ".feather" is used, and when an IPC file is not called Feather,
> > > > >> > > > >>    extension ".arrow" is used.
> > > > >> > > > >>
> > > > >> > > > >> I mistakenly thought the current status was 2, but according to the
> > > > >> > > > >> discussion in this PR (<https://github.com/apache/arrow/pull/13677>),
> > > > >> > > > >> apparently the current status seems 3. (However, there seems to be no
> > > > >> > > > >> rule as to when an IPC file should be called a Feather)
> > > > >> > > > >>
> > > > >> > > > >> I am not very familiar with Arrow and this is my first post to this
> > > > >> > > > >> mailing list so I apologize if I have done something wrong or
> > > > >> > > > >> inappropriate.
> > > > >> > > > >>
> > > > >> > > > >> Best,
> > > > >> > > > >> SHIMA Tatsuya