[jira] [Created] (ARROW-7305) High memory usage writing pyarrow.Table to parquet

2019-12-03 Thread Bogdan Klichuk (Jira)
Bogdan Klichuk created ARROW-7305:
    Summary: High memory usage writing pyarrow.Table to parquet
    Key: ARROW-7305
    URL: https://issues.apache.org/jira/browse/ARROW-7305
    Project: Apache Arrow

[jira] [Created] (ARROW-7150) [Python] Explain parquet file size growth

2019-11-12 Thread Bogdan Klichuk (Jira)
Bogdan Klichuk created ARROW-7150:
    Summary: [Python] Explain parquet file size growth
    Key: ARROW-7150
    URL: https://issues.apache.org/jira/browse/ARROW-7150
    Project: Apache Arrow
    Issue Type

[jira] [Created] (ARROW-6481) Bad performance of read_csv() with column_types

2019-09-07 Thread Bogdan Klichuk (Jira)
Bogdan Klichuk created ARROW-6481:
    Summary: Bad performance of read_csv() with column_types
    Key: ARROW-6481
    URL: https://issues.apache.org/jira/browse/ARROW-6481
    Project: Apache Arrow
    Issue

[jira] [Created] (ARROW-5811) pyarrow.csv.read_csv: Ability to not infer column types.

2019-06-30 Thread Bogdan Klichuk (JIRA)
Bogdan Klichuk created ARROW-5811:
    Summary: pyarrow.csv.read_csv: Ability to not infer column types.
    Key: ARROW-5811
    URL: https://issues.apache.org/jira/browse/ARROW-5811
    Project: Apache Arrow

[jira] [Created] (ARROW-5791) pyarrow.csv.read_csv hangs + eats all RAM

2019-06-29 Thread Bogdan Klichuk (JIRA)
Bogdan Klichuk created ARROW-5791:
    Summary: pyarrow.csv.read_csv hangs + eats all RAM
    Key: ARROW-5791
    URL: https://issues.apache.org/jira/browse/ARROW-5791
    Project: Apache Arrow
    Issue Type

Re: Using pyarrow.Table for long-term storage of pandas DataFrames

2019-06-16 Thread Bogdan Klichuk
). > > Thanks, > Micah > > [1] > https://lists.apache.org/thread.html/96e595ce6c3cffa37bad23181abc9372c30457210a65d72d92593ce5@%3Cdev.arrow.apache.org%3E > > On Sun, Jun 16, 2019 at 10:01 PM Bogdan Klichuk > wrote: > >> Hello. Thanks for the reply! >> >> >> >

Re: Using pyarrow.Table for long-term storage of pandas DataFrames

2019-06-16 Thread Bogdan Klichuk
> > tanding is that the formats are nearly identical (mostly just a
> > difference in metadata) so the performance similarity isn't surprising.

Alright, so speaking of serialization of pyarrow.Table vs Feather, if they are pretty much the same, but arrow alone shouldn't be used to lo

Using pyarrow.Table for long-term storage of pandas DataFrames

2019-06-12 Thread Bogdan Klichuk
INCLUDED into benchmarks. -- Best wishes, Bogdan Klichuk