I'm -0.9 on "Arrow Compute Engine". It makes it sound like it is THE canonical Arrow one, second classing DataFusion and Gandiva.
No strong feelings on other names. Naming in general is an extremely subjective process...

On Thu, Mar 31, 2022, 2:33 PM Weston Pace <weston.p...@gmail.com> wrote:

> I'm +1 for "arrow compute engine". In the docs we currently refer to
> it as the "streaming execution engine". I do like the word
> "streaming" as it is the difference between the engine and the general
> "compute" module but the word is also overloaded and we can easily
> include the word "streaming" in the first sentence of whatever
> description we have for the engine.
>
> > I'd personally like to see such a word for the query engine (otherwise we'd
> > have to call Arrow Flight "Arrow Wire Protocol" 😅). Even something like
> > "Arrow Archer" or "Arrow Bow" would be sufficient for me.
>
> I do like the idea of calling it just "bow" and I'm not against either
> of these names (+0). I think I still lean towards something more
> plain and descriptive (arrow wire protocol has a nice ring to it...)
>
> On Tue, Mar 29, 2022 at 9:10 AM Sasha Krassovsky
> <krassovskysa...@gmail.com> wrote:
> >
> > In my view, the Arrow project has the core format specification (called
> > Arrow), and then ancillary libraries for actually *doing* stuff with Arrow
> > data, such as Arrow Flight and the query engine (within the `arrow`
> > subdirectory in particular). I think these ancillary libraries should all
> > follow a similar naming convention. Seems like the precedent set by Arrow
> > Flight is "Arrow <mildly archery-related, descriptive word>", so I'd
> > personally like to see such a word for the query engine (otherwise we'd
> > have to call Arrow Flight "Arrow Wire Protocol" 😅). Even something like
> > "Arrow Archer" or "Arrow Bow" would be sufficient for me.
> >
> > Sasha Krassovsky
> >
> > On Tue, Mar 29, 2022 at 9:25 AM Gavin Ray <ray.gavi...@gmail.com> wrote:
> >
> > > "Arrow Compute Engine" sounds quite nice to me, tbh
> > > Agreeing with the points made above about ACE being difficult to google,
> > > and AQE being a loaded term in query engines already.
> > >
> > > On Tue, Mar 29, 2022 at 10:07 AM Andy Grove <andygrov...@gmail.com> wrote:
> > >
> > > > Just my 2 cents on this. If you were to call it ACE, I would make the C
> > > > stand for "Compute" rather than C++ since it is intended to be used from
> > > > other languages, such as Python.
> > > >
> > > > The problem with ACE is that it is a common word and it will make it hard to
> > > > Google for documentation. Even the combination of Arrow and ACE already has
> > > > plenty of results.
> > > >
> > > > Also, I saw in the linked doc a reference to AQE (for Arrow Query Engine).
> > > > I would not recommend using this since many people know AQE as Adaptive
> > > > Query Execution (especially Spark users).
> > > >
> > > > "Arrow Compute Engine" in full doesn't sound bad perhaps?
> > > >
> > > > With DataFusion, I made a list of words related to the project (data,
> > > > query, compute, engine, etc) and then a list of completely unrelated words
> > > > and then looked at the combinations to see what sounded good to me.
> > > >
> > > > Andy.
> > > >
> > > > On Mon, Mar 28, 2022 at 4:31 PM Antoine Pitrou <anto...@python.org> wrote:
> > > >
> > > > > ACE is already the name of a well-known C++ library, though I'm not sure
> > > > > how widely used it is nowadays:
> > > > > http://www.dre.vanderbilt.edu/~schmidt/ACE.html
> > > > >
> > > > > I would name it "execution engine" or "Arrow C++ execution engine" in full.
> > > > >
> > > > > Regards
> > > > >
> > > > > Antoine.
> > > > >
> > > > > On 29/03/2022 at 00:15, Wes McKinney wrote:
> > > > > > hi all,
> > > > > >
> > > > > > There has been a steady stream of work over the last year and a half
> > > > > > or so to create a set of query engine building blocks in C++ to
> > > > > > evaluate queries against Arrow Datasets and input streams, which can
> > > > > > be of use to applications that are already building on top of the
> > > > > > Arrow C++ project. This effort has a smaller surface area than
> > > > > > DataFusion since SQL parsing and query optimization are being left to
> > > > > > other tools.
> > > > > >
> > > > > > I thought it would be useful to have a name for this subproject
> > > > > > similar to how we have Gandiva, Plasma, DataFusion, and other named
> > > > > > Apache Arrow subprojects. We had discussed creating a project like
> > > > > > this a few years ago [1], but since there are now multiple
> > > > > > Arrow-native or Arrow-compatible query engines in the wild, it would
> > > > > > be helpful to disambiguate.
> > > > > >
> > > > > > One simple name is ACE — Arrow C++ Engine. I'm not very good at naming
> > > > > > things, so if there are other suggestions from the community I would
> > > > > > love to hear them!
> > > > > >
> > > > > > Thanks,
> > > > > > Wes
> > > > > >
> > > > > > [1]:
> > > > > > https://docs.google.com/document/d/10RoUZmiMQRi_J1FcPeVAUAMJ6d_ZuiEbaM2Y33sNPu4/edit#heading=h.2k6k5a4y9b8y