It seems that we can use .asf.yaml to set the description on
GitHub:

  https://cwiki.apache.org/confluence/display/INFRA/git+-+.asf.yaml+features#Git.asf.yamlfeatures-GitHubsettings

  github:
    description: "Apache Arrow is ..."

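As a sketch only: if the one-sentence description Wes proposed on Jun 10 in this thread is the one adopted (an assumption; the snippets here do not show a final decision), the entry could look like the following. Only the `github` and `description` keys come from the message above; the description text is copied from Wes's proposal.

```yaml
# .asf.yaml (sketch; the description text is assumed to be the Jun 10
# proposal from this thread, not a confirmed final choice)
github:
  description: "Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing."
```

Setting the description this way keeps the change in the repository itself, so it would not require a separate request to Infra.
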
In
"Re: Long title on github page" on Thu, 10 Jun 2021 17:44:57 -0500,
Wes McKinney wrote:
I'll wait a day or two for more feedback to percolate and then ask
Infra to change the description on GitHub.
On Thu, Jun 10, 2021 at 4:47 PM Adam Lippai wrote:
>
> +1
>
> On Thu, Jun 10, 2021, 23:38 Antoine Pitrou wrote:
>
> >
> > Sounds good enough to me.
> >
> >
> > On 10/06/2021 at 23:35, Wes
+1
On Thu, Jun 10, 2021, 23:38 Antoine Pitrou wrote:
>
> Sounds good enough to me.
>
>
> On 10/06/2021 at 23:35, Wes McKinney wrote:
> > I hate to reopen this can of worms again, but here is my effort to
> > synthesize feedback:
> >
> > "Apache Arrow is a multi-language toolbox for accelerated
Sounds good enough to me.
On 10/06/2021 at 23:35, Wes McKinney wrote:
I hate to reopen this can of worms again, but here is my effort to
synthesize feedback:
"Apache Arrow is a multi-language toolbox for accelerated data
interchange and in-memory processing."
On Thu, Jun 10, 2021 at 12:37
I hate to reopen this can of worms again, but here is my effort to
synthesize feedback:
"Apache Arrow is a multi-language toolbox for accelerated data
interchange and in-memory processing."
On Thu, Jun 10, 2021 at 12:37 PM Dominik Moritz wrote:
>
> I thought there were some good suggestions in t
I thought there were some good suggestions in this thread. @Wes, did you
find a description you liked?
On May 18, 2021 at 06:24:47, Adam Hooper wrote:
> Poll question: why did you choose Arrow?
>
> Personally: I researched Arrow because it's a spec for IPC. (My requirement
> was: "wrap computati
Poll question: why did you choose Arrow?
Personally: I researched Arrow because it's a spec for IPC. (My requirement
was: "wrap computations in a separate process.") I chose Arrow for its
community and ecosystem -- in other words, because my peers chose it.
I happen to use the compute kernel and
"Apache Arrow is a data processing library that also provides a uniform,
efficient interface for data systems."
This probably still isn't quite right; I imagine the bit about "for data
systems" needs some addition (maybe "for transport between data systems")?
My primary motivators:
- "A data
I'd avoid the word "structured" as it is somewhat ill-defined.
On Mon, May 17, 2021 at 12:37 PM Mauricio Vargas wrote:
>
> more marketed:
> How about: "Apache Arrow is a format and language-agnostic library focused
> on efficient sharing and processing of structured data."
>
> On Mon, May 17, 202
more marketed:
How about: "Apache Arrow is a format and language-agnostic library focused
on efficient sharing and processing of structured data."
On Mon, May 17, 2021 at 6:25 PM Micah Kornfield wrote:
> How about: "Apache Arrow is a collection of specifications, cross language
> libraries and a
How about: "Apache Arrow is a collection of specifications, cross language
libraries and applications focused on efficient sharing and processing of
structured data."
On Mon, May 17, 2021 at 3:06 PM Wes McKinney wrote:
> On Mon, May 17, 2021 at 4:58 PM Weston Pace wrote:
> >
> > > “Apache Arrow
On Mon, May 17, 2021 at 4:58 PM Weston Pace wrote:
>
> > “Apache Arrow is a format and compute kernel for in-memory data”
>
> I like this but no one ever knows what "in-memory" means (or they just
> think 'data is always in memory'). How about...
>
> "Apache Arrow is a format and compute kernel f
> “Apache Arrow is a format and compute kernel for in-memory data”
I like this but no one ever knows what "in-memory" means (or they just
think 'data is always in memory'). How about...
"Apache Arrow is a format and compute kernel for zero-copy processing
and sharing of data."
or...
"Apache Ar
a few ideas
github.com/apache/arrow - Apache Arrow is an efficient library for big data
processing and sharing
github.com/apache/arrow - Apache Arrow is a computational tool for
processing, storing and sharing large datasets
github.com/apache/arrow - Apache Arrow is a fast and simple library fo
Alright, well, whatever it is, it must fit into one breath. If the high-concept
pitch is successful, people will stick around for the full pitch.
Words such as “platform” and “enable” are noise. You say “platform”, they start
to say “what exactly do you mean by platform”, the elevator doors open
Hi,
I'm 100% behind Wes.
Being not just a file format, but adding compute and libs are the best
selling points of Arrow.
It shouldn't be reduced to "a file format and its utils", as the ecosystem
is at least that important.
This is something we have to emphasize constantly.
Best regards,
Adam Li
One more suggestion for the bucket:
"Apache Arrow is a computational platform for efficient in-memory data
representation and processing."
On Mon, May 17, 2021 at 2:49 PM Wes McKinney wrote:
> I think less is better in the description, but unfortunately the
> association of Arrow as being "just
I think less is better in the description, but unfortunately the
association of Arrow as being "just a data format" has been actively
harmful in some ways to community growth. We have a data format, yes,
but we are also creating a computational platform to go hand-in-hand
with the data format to ma
sorry to come with a marketing-style title, but how about
github.com/apache/arrow - Apache Arrow is an efficient format for big data
processing and sharing
?
On Mon, May 17, 2021 at 1:15 PM Julian Hyde wrote:
> I think that the “cross-language development platform for” is noise. (I’m
> sure tha
I think that the “cross-language development platform for” is noise. (I’m sure
that JPEG developers think that JPEG is a “cross-language development platform”
too. But it isn’t. It is an image format.)
“Apache Arrow is a data format for efficient in-memory processing.”
I’ll note that in marketing
I agree with Nate's and Brian's suggestions, but would like to add that we
can make it a one-liner for more conciseness and consistency with other
Apache projects.
Apologies if it seems I am going around the suggestions loop again.
"Apache Arrow is a cross-language development platform enabling ef
Thank you for bringing this up Dominik. I sampled some of the descriptions
for other Apache projects I frequent, the ones with a meaningful
description have a single sentence:
github.com/apache/spark - Apache Spark - A unified analytics engine for
large-scale data processing
github.com/apache/beam
It's probably best for the description to limit mentions of specific
features. There are some high level features mentioned in the
description now ("computational libraries and zero-copy streaming
messaging and interprocess communication"), but now in 2021 since the
project has grown so much, it could
How about
'Apache Arrow is a cross-language development platform for in-memory data.
It enables systems to process and transport data efficiently, providing a
simple and fast library for partitioning of large tables'?
Sorry for the delay, long election day
On Sun, May 16, 2021, 2:27 PM Nate Bauernfei
Suggestion: faster -> more efficiently
"Apache Arrow is a cross-language development platform for in-memory
data. It enables systems to process and transport data more efficiently."
On Sun, May 16, 2021 at 11:35 AM Wes McKinney wrote:
> Here's what's there now:
>
> "Apache Arrow is a cross-langua
Here's what's there now:
"Apache Arrow is a cross-language development platform for in-memory
data. It specifies a standardized language-independent columnar memory
format for flat and hierarchical data, organized for efficient
analytic operations on modern hardware. It also provides computational
l
Super minor issue but could someone make the description on GitHub shorter?
🙏
GitHub puts the description into the title of the page and makes it hard to
find it in URL autocomplete.