I agree with Nate's and Brian's suggestions, but would like to add that we
can make it a one-liner for more conciseness and consistency with other
Apache projects.
Apologies if it seems I am going around the suggestions loop again.

"Apache Arrow is a cross-language development platform enabling efficient
in-memory data processing and transport."

On Mon, May 17, 2021 at 10:11 AM Brian Hulette <bhule...@apache.org> wrote:

> Thank you for bringing this up, Dominik. I sampled some of the descriptions
> for other Apache projects I frequent; the ones with a meaningful
> description have a single sentence:
>
> github.com/apache/spark - Apache Spark - A unified analytics engine for
> large-scale data processing
> github.com/apache/beam - Apache Beam is a unified programming model for
> Batch and Streaming
> github.com/apache/avro - Apache Avro is a data serialization system
>
> Several others (Flink, Hadoop, ...) just have "[Mirror of] Apache <name>"
> as the description.
>
> +1 for Nate's suggestion "Apache Arrow is a cross-language development
> platform for in-memory data. It enables systems to process and transport
> data more efficiently."
>
> On Mon, May 17, 2021 at 5:23 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > It's probably best for the description to limit mentions of specific
> > features. There are some high-level features mentioned in the
> > description now ("computational libraries and zero-copy streaming
> > messaging and interprocess communication"), but now in 2021, since the
> > project has grown so much, it could leave people with a limited view
> > of what they might find here.
> >
> > On Mon, May 17, 2021 at 12:14 AM Mauricio Vargas
> > <mauri...@ursacomputing.com> wrote:
> > >
> > > How about
> > > 'Apache Arrow is a cross-language development platform for in-memory
> > > data. It enables systems to process and transport data efficiently,
> > > providing a simple and fast library for partitioning of large tables'?
> > >
> > > Sorry for the delay, long election day
> > >
> > > On Sun, May 16, 2021, 2:27 PM Nate Bauernfeind <natebauernfe...@deephaven.io> wrote:
> > >
> > > > Suggestion: faster -> more efficiently
> > > >
> > > > "Apache Arrow is a cross-language development platform for in-memory
> > > > data. It enables systems to process and transport data more
> > > > efficiently."
> > > >
> > > > On Sun, May 16, 2021 at 11:35 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > > >
> > > > > Here's what's there now:
> > > > >
> > > > > "Apache Arrow is a cross-language development platform for in-memory
> > > > > data. It specifies a standardized language-independent columnar memory
> > > > > format for flat and hierarchical data, organized for efficient
> > > > > analytic operations on modern hardware. It also provides computational
> > > > > libraries and zero-copy streaming messaging and interprocess
> > > > > communication…"
> > > > >
> > > > > How about something shorter like
> > > > >
> > > > > "Apache Arrow is a cross-language development platform for in-memory
> > > > > data. It enables systems to process and transport data faster."
> > > > >
> > > > > Suggestions / refinements from others welcome
> > > > >
> > > > >
> > > > > On Sat, May 15, 2021 at 9:12 PM Dominik Moritz <domor...@cmu.edu> wrote:
> > > > > >
> > > > > > Super minor issue, but could someone make the description on GitHub
> > > > > > shorter?
> > > > > >
> > > > > > GitHub puts the description into the title of the page, which makes
> > > > > > the page hard to find in URL autocomplete.
> > > > > >
