A few ideas:

github.com/apache/arrow - Apache Arrow is an efficient library for big data processing and sharing
github.com/apache/arrow - Apache Arrow is a computational tool for processing, storing and sharing large datasets
github.com/apache/arrow - Apache Arrow is a fast and simple library for big data analytics
*github.com/apache/arrow <http://github.com/apache/arrow> - Apache Arrow is a powerful workhorse for analytic operations on modern hardware*

On Mon, May 17, 2021 at 3:13 PM Julian Hyde <jhyde.apa...@gmail.com> wrote:

> Alright, well, whatever it is, it must fit into one breath. If the
> high-concept pitch is successful, people will stick around for the full
> pitch.
>
> Words such as “platform” and “enable” are noise. You say “platform”, they
> start to say “what exactly do you mean by platform”, the elevator doors
> open, and they’re gone.
>
> “Apache Arrow is a format and compute kernel for in-memory data”
>
> > On May 17, 2021, at 12:03 PM, Eduardo Ponce <edponc...@gmail.com> wrote:
> >
> > One more suggestion for the bucket:
> > "Apache Arrow is a computational platform for efficient in-memory data
> > representation and processing."
> >
> > On Mon, May 17, 2021 at 2:49 PM Wes McKinney <wesmck...@gmail.com> wrote:
> >
> >> I think less is better in the description, but unfortunately the
> >> association of Arrow as being "just a data format" has been actively
> >> harmful in some ways to community growth. We have a data format, yes,
> >> but we are also creating a computational platform to go hand-in-hand
> >> with the data format to make it easier to build fast applications that
> >> use the data format. So the description needs to capture both of these
> >> ideas.
> >>
> >> On Mon, May 17, 2021 at 12:15 PM Julian Hyde <jhyde.apa...@gmail.com> wrote:
> >>>
> >>> I think that the “cross-language development platform for” is noise.
> >>> (I’m sure that JPEG developers think that JPEG is a “cross-language
> >>> development platform” too. But it isn’t. It is an image format.)
> >>>
> >>> “Apache Arrow is a data format for efficient in-memory processing.”
> >>>
> >>> I’ll note that in marketing speak, we are developing a high-concept
> >>> pitch [1] here. Every company needs a name, a brand, a high-concept
> >>> pitch, and a 3- or 4-sentence description. But every Apache project
> >>> needs these too. It’s worth spending the time on the description,
> >>> also, and then using them in all the places that we describe Arrow.
> >>>
> >>> Julian
> >>>
> >>> [1] https://www.growthink.com/content/whats-your-high-concept-pitch
> >>>
> >>>> On May 17, 2021, at 7:38 AM, Eduardo Ponce <edponc...@gmail.com> wrote:
> >>>>
> >>>> I agree with Nate's and Brian's suggestions, but would like to add
> >>>> that we can make it a one-liner for more conciseness and consistency
> >>>> with other Apache projects.
> >>>> Apologies if it seems I am going around the suggestions loop again.
> >>>>
> >>>> "Apache Arrow is a cross-language development platform enabling
> >>>> efficient in-memory data processing and transport."
> >>>>
> >>>> On Mon, May 17, 2021 at 10:11 AM Brian Hulette <bhule...@apache.org> wrote:
> >>>>
> >>>>> Thank you for bringing this up, Dominik. I sampled some of the
> >>>>> descriptions for other Apache projects I frequent; the ones with a
> >>>>> meaningful description have a single sentence:
> >>>>>
> >>>>> github.com/apache/spark - Apache Spark - A unified analytics engine
> >>>>> for large-scale data processing
> >>>>> github.com/apache/beam - Apache Beam is a unified programming model
> >>>>> for Batch and Streaming
> >>>>> github.com/apache/avro - Apache Avro is a data serialization system
> >>>>>
> >>>>> Several others (Flink, Hadoop, ...) just have "[Mirror of] Apache
> >>>>> <name>" as the description.
> >>>>>
> >>>>> +1 for Nate's suggestion "Apache Arrow is a cross-language
> >>>>> development platform for in-memory data. It enables systems to
> >>>>> process and transport data more efficiently."
> >>>>>
> >>>>> On Mon, May 17, 2021 at 5:23 AM Wes McKinney <wesmck...@gmail.com> wrote:
> >>>>>
> >>>>>> It's probably best for the description to limit mentions of
> >>>>>> specific features. There are some high level features mentioned in
> >>>>>> the description now ("computational libraries and zero-copy
> >>>>>> streaming messaging and interprocess communication"), but now in
> >>>>>> 2021 since the project has grown so much, it could leave people
> >>>>>> with a limited view of what they might find here.
> >>>>>>
> >>>>>> On Mon, May 17, 2021 at 12:14 AM Mauricio Vargas
> >>>>>> <mauri...@ursacomputing.com> wrote:
> >>>>>>>
> >>>>>>> How about
> >>>>>>> 'Apache Arrow is a cross-language development platform for
> >>>>>>> in-memory data. It enables systems to process and transport data
> >>>>>>> efficiently, providing a simple and fast library for partitioning
> >>>>>>> of large tables'?
> >>>>>>>
> >>>>>>> Sorry for the delay, long election day
> >>>>>>>
> >>>>>>> On Sun, May 16, 2021, 2:27 PM Nate Bauernfeind
> >>>>>>> <natebauernfe...@deephaven.io> wrote:
> >>>>>>>
> >>>>>>>> Suggestion: faster -> more efficiently
> >>>>>>>>
> >>>>>>>> "Apache Arrow is a cross-language development platform for
> >>>>>>>> in-memory data. It enables systems to process and transport data
> >>>>>>>> more efficiently."
> >>>>>>>>
> >>>>>>>> On Sun, May 16, 2021 at 11:35 AM Wes McKinney
> >>>>>>>> <wesmck...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Here's what's there now:
> >>>>>>>>>
> >>>>>>>>> "Apache Arrow is a cross-language development platform for
> >>>>>>>>> in-memory data. It specifies a standardized
> >>>>>>>>> language-independent columnar memory format for flat and
> >>>>>>>>> hierarchical data, organized for efficient analytic operations
> >>>>>>>>> on modern hardware. It also provides computational libraries
> >>>>>>>>> and zero-copy streaming messaging and interprocess
> >>>>>>>>> communication…"
> >>>>>>>>>
> >>>>>>>>> How about something shorter like
> >>>>>>>>>
> >>>>>>>>> "Apache Arrow is a cross-language development platform for
> >>>>>>>>> in-memory data. It enables systems to process and transport
> >>>>>>>>> data faster."
> >>>>>>>>>
> >>>>>>>>> Suggestions / refinements from others welcome
> >>>>>>>>>
> >>>>>>>>> On Sat, May 15, 2021 at 9:12 PM Dominik Moritz
> >>>>>>>>> <domor...@cmu.edu> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Super minor issue but could someone make the description on
> >>>>>>>>>> GitHub shorter?
> >>>>>>>>>>
> >>>>>>>>>> GitHub puts the description into the title of the page and
> >>>>>>>>>> makes it hard to find it in URL autocomplete.