[jira] [Created] (ARROW-892) [GLib] Fix GArrowTensor document

2017-04-25 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-892: -- Summary: [GLib] Fix GArrowTensor document Key: ARROW-892 URL: https://issues.apache.org/jira/browse/ARROW-892 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-893) Add GLib document to Web site

2017-04-25 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-893: -- Summary: Add GLib document to Web site Key: ARROW-893 URL: https://issues.apache.org/jira/browse/ARROW-893 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-894) [GLib] Add GArrowPoolBuffer

2017-04-25 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-894: -- Summary: [GLib] Add GArrowPoolBuffer Key: ARROW-894 URL: https://issues.apache.org/jira/browse/ARROW-894 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-895) Nullable variable length vector lastSet not set correctly

2017-04-25 Thread Steven Phillips (JIRA)
Steven Phillips created ARROW-895: - Summary: Nullable variable length vector lastSet not set correctly Key: ARROW-895 URL: https://issues.apache.org/jira/browse/ARROW-895 Project: Apache Arrow

Pandas timestamp

2017-04-25 Thread Bryan Cutler
I am writing a unit test to compare that a Pandas DataFrame made by Arrow is equal to one constructed directly with data. The timestamp values are a Python datetime object with a timezone tzinfo object. When I compare the results, the values are equal but the schema is not. Using arrow the type

Re: Pandas timestamp

2017-04-25 Thread Wes McKinney
hi Bryan, You will want to create DataFrame objects having datetime64[ns] columns. There are some examples in the pyarrow test suite: https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_convert_pandas.py#L324 You can convert an array of datetime.datetime objects to datetime64[n
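A minimal sketch of the conversion described above, assuming pandas is installed. The sample values and variable names are hypothetical, and `pd.to_datetime` is only one way to perform the conversion; the truncated message may have gone on to suggest a different call.

```python
from datetime import datetime, timezone

import pandas as pd

# Hypothetical sample data: timezone-aware datetime.datetime objects.
values = [
    datetime(2017, 4, 25, 12, 0, tzinfo=timezone.utc),
    datetime(2017, 4, 25, 18, 30, tzinfo=timezone.utc),
]

# Forcing dtype=object keeps the values as plain Python objects.
as_objects = pd.Series(values, dtype=object)

# pd.to_datetime converts them to pandas' native datetime64 representation;
# utc=True keeps the result timezone-aware in UTC.
as_datetime64 = pd.to_datetime(as_objects, utc=True)

print(as_objects.dtype)
print(as_datetime64.dtype)
```

Building both DataFrames in a test from columns converted this way should make their dtypes comparable, although the exact resolution reported in the dtype can depend on the pandas version.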

[jira] [Created] (ARROW-896) [Docs] Add Jekyll plugin for including rendered Jupyter notebooks on website

2017-04-25 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-896: -- Summary: [Docs] Add Jekyll plugin for including rendered Jupyter notebooks on website Key: ARROW-896 URL: https://issues.apache.org/jira/browse/ARROW-896 Project: Apache

[jira] [Created] (ARROW-897) [GLib] Build arrow-glib as a separate build in the Travis CI build matrix

2017-04-25 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-897: -- Summary: [GLib] Build arrow-glib as a separate build in the Travis CI build matrix Key: ARROW-897 URL: https://issues.apache.org/jira/browse/ARROW-897 Project: Apache Arr

Re: Improvements to Apache Arrow website

2017-04-25 Thread Wes McKinney
Thanks to Kouhei we now also have documentation for the GLib C bindings on the website: http://arrow.apache.org/docs/c_glib/ On Sun, Apr 23, 2017 at 11:12 PM, Wes McKinney wrote: > hi folks, > > In advance of the Arrow 0.3 release, I updated our website to use a static > site generator (Jekyll)

[jira] [Created] (ARROW-898) [C++] Expand metadata support to field level, provide for sharing instances of KeyValueMetadata

2017-04-25 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-898: -- Summary: [C++] Expand metadata support to field level, provide for sharing instances of KeyValueMetadata Key: ARROW-898 URL: https://issues.apache.org/jira/browse/ARROW-898

[jira] [Created] (ARROW-899) [Docs] Add CHANGELOG

2017-04-25 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-899: -- Summary: [Docs] Add CHANGELOG Key: ARROW-899 URL: https://issues.apache.org/jira/browse/ARROW-899 Project: Apache Arrow Issue Type: Improvement Compone

Serialize/deserialize ArrowRecordBatch to/from bytes?

2017-04-25 Thread Li Jin
Hello, I am trying to serialize/deserialize ArrowRecordBatch in Java, but since the API has changed quite a bit from 0.2.0, I struggle to find how to do it correctly. I checked the test for ArrowFileWriter and ArrowFileReader, but it's still not clear to me how to do it. Can someone give an examp

Re: Serialize/deserialize ArrowRecordBatch to/from bytes?

2017-04-25 Thread Julien Le Dem
Look at org.apache.arrow.vector.stream.MessageSerializer. There are methods to serialize/deserialize to/from channels. These could be adapted to byte arrays. The APIs are usually in terms of ByteBuffers. On Tue, Apr 25, 2017 at 3:22 PM, Li Jin wrote: > Hello, > > I am trying to serialize/de

Re: Pandas timestamp

2017-04-25 Thread Bryan Cutler
Thanks Wes. I think I've managed to confuse myself pretty good over this, I'm not sure where the fix should be. Spark, by default, will store a timestamp internally with python "time.mktime", which is in local time and not UTC, I believe. If there is a tzinfo object, Spark will use "calendar.tim
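The local-time versus UTC distinction mentioned above can be sketched with the standard library alone. This assumes the truncated name in the message refers to `calendar.timegm`. The sample datetime and variable names are hypothetical, and the snippet shows only the two stdlib functions, not Spark's actual conversion code.

```python
import calendar
import time
from datetime import datetime, timezone

# Hypothetical naive datetime (no tzinfo attached).
dt = datetime(2017, 4, 25, 12, 0, 0)
fields = dt.timetuple()

# time.mktime interprets the fields as local time.
local_seconds = time.mktime(fields)

# calendar.timegm interprets the same fields as UTC.
utc_seconds = calendar.timegm(fields)

# The difference is the local zone's UTC offset at that moment.
offset_hours = (utc_seconds - local_seconds) / 3600

print(local_seconds, utc_seconds, offset_hours)
```

On a machine whose local zone is UTC the two results agree; elsewhere they differ by the zone offset, which is the kind of mismatch that can surface when timestamps produced by one path are compared against the other.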

Re: Pandas timestamp

2017-04-25 Thread Wes McKinney
From what you've written I am not sure where the problem is. If you can point us to some unit tests or some other code that is not working, we can help with the "pandas" way of doing things. If changes are needed in PySpark this would be good motivation. On Tue, Apr 25, 2017 at 6:40 PM Bryan Cutl

Re: Serialize/deserialize ArrowRecordBatch to/from bytes?

2017-04-25 Thread Li Jin
Thanks Julien. I will follow https://github.com/apache/arrow/blob/990e2bde758ac8bc6e4497ae1bc37f89b71bb5cf/java/vector/src/test/java/org/apache/arrow/vector/stream/MessageSerializerTest.java#L91

Re: Serialize/deserialize ArrowRecordBatch to/from bytes?

2017-04-25 Thread Wes McKinney
There is also https://github.com/apache/arrow/blob/master/java/vector/src/test/java/org/apache/arrow/vector/file/TestArrowStreamPipe.java On Tue, Apr 25, 2017 at 8:46 PM, Li Jin wrote: > Thanks Julien. I will follow > https://github.com/apache/arrow/blob/990e2bde758ac8bc6e4497ae1bc37f > 89b71bb5

Re: Serialize/deserialize ArrowRecordBatch to/from bytes?

2017-04-25 Thread Wes McKinney
Also, now that we have a website that is easier to write content for (in Markdown), it would be great if some Java developers could volunteer some time to write user-facing documentation to go with the Javadocs. On Tue, Apr 25, 2017 at 8:51 PM, Wes McKinney wrote: > There is also https://github.

[jira] [Created] (ARROW-900) [Python] UnboundLocalError in ParquetDatasetPiece

2017-04-25 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-900: -- Summary: [Python] UnboundLocalError in ParquetDatasetPiece Key: ARROW-900 URL: https://issues.apache.org/jira/browse/ARROW-900 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-901) [Python] Write FixedSizeBinary to Parquet

2017-04-25 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-901: -- Summary: [Python] Write FixedSizeBinary to Parquet Key: ARROW-901 URL: https://issues.apache.org/jira/browse/ARROW-901 Project: Apache Arrow Issue Type: Improvem