Put that answer on the front page of the web site.
Well said.
On Mon, Feb 22, 2016 at 2:05 PM, Wes McKinney wrote:
> hi Stuart,
>
> Currently pandas and NumPy only support flat, non-nested data. Nested
> data includes column value types such as arrays, structs, maps, and
> unions. This enab
Many perspectives in the Hadoop "how to contribute" doc apply to Apache
projects in general: https://wiki.apache.org/hadoop/HowToContribute
I'll leave it to active Arrow contributors to add Arrow-specific rules. I
think creating a JIRA ticket is always a good starting point (assuming you
already have
hi Stuart,
Currently pandas and NumPy only support flat, non-nested data. Nested
data includes column value types such as arrays, structs, maps, and
unions. This enables you to analyze JSON-like data natively in-memory
without pre-flattening or normalization.
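To make "pre-flattening" concrete, here is a rough sketch in plain Python (no pandas or Arrow calls) of what flat-only tools force you to do with JSON-like records today. The record layout, the field names (`id`, `user`, `tags`), and the `flatten` helper are all made up for illustration; this is not how Arrow represents nested data, only the workaround that native nested types would make unnecessary.

```python
# Hypothetical JSON-like records: "user" is a struct, "tags" is an array.
records = [
    {"id": 1, "user": {"name": "a", "age": 30}, "tags": ["x", "y"]},
    {"id": 2, "user": {"name": "b", "age": 25}, "tags": []},
]


def flatten(record):
    """Pre-flatten one nested record into flat rows (illustrative helper)."""
    base = {"id": record["id"]}
    # Struct fields become dotted top-level columns.
    for key, value in record["user"].items():
        base[f"user.{key}"] = value
    # Array elements are exploded into repeated rows; an empty array
    # still yields one row, with None standing in for the missing tag.
    tags = record["tags"] or [None]
    return [dict(base, tag=tag) for tag in tags]


# One flat row per (record, tag) pair, suitable for a flat table.
flat_rows = [row for record in records for row in flatten(record)]
```

Note that the flat form repeats the `id` and `user.*` values for every tag and loses the distinction between an empty array and a missing one, which is part of the cost of normalizing nested data up front.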
There's also an open question about
Hey Wes,
Very exciting to see things moving along on the Python front. As you state
in your post, fast, ubiquitous columnar data will be a great foundation,
especially for more modern data processing and ETL tools. Though I am a
bit curious what you mean by nested columnar data...
Thanks,
Stuar
hi all,
I did a little bit of analysis of the costs of serialization bottlenecks in
data access for Python pandas users and how (at a high level, no perf
numbers yet!) Apache Arrow will help:
http://wesmckinney.com/blog/pandas-and-apache-arrow/
Feedback and comments welcome.
cheers,
Wes
Hi Team,
I want to start contributing to Arrow (ASF). Please let me know how to
proceed.
My skills:
1) Big data (MR, Pig, Hive, Sqoop, Java, RDBMS (Teradata), shell scripting)
Thanks,
-Vikas