[jira] [Created] (ARROW-11456) OSError: Capacity error: BinaryBuilder cannot reserve space for more than 2147483646 child elements

2021-02-01 Thread Pac A. He (Jira)
Pac A. He created ARROW-11456: - Summary: OSError: Capacity error: BinaryBuilder cannot reserve space for more than 2147483646 child elements Key: ARROW-11456 URL: https://issues.apache.org/jira/browse/ARROW-11456

[jira] [Updated] (ARROW-11456) OSError: Capacity error: BinaryBuilder cannot reserve space for more than 2147483646 child elements

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.re

[jira] [Updated] (ARROW-11456) OSError: Capacity error: BinaryBuilder cannot reserve space for more than 2147483646 child elements

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Environment: pyarrow 3.0.0 / 2.0.0 pandas 1.2.1 python 3.8.6 was: pyarrow 3.0.0 / 2.0.0 pandas 1.2.

[jira] [Updated] (ARROW-11456) OSError: Capacity error: BinaryBuilder cannot reserve space for more than 2147483646 child elements

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.re

[jira] [Updated] (ARROW-11456) OSError: Capacity error: BinaryBuilder cannot reserve space for more than 2147483646 child elements

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.re

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.re

[jira] [Commented] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276501#comment-17276501 ] Pac A. He commented on ARROW-11456: --- [~jorisvandenbossche] This is very difficult in t

[jira] [Comment Edited] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276501#comment-17276501 ] Pac A. He edited comment on ARROW-11456 at 2/1/21, 5:20 PM:

[jira] [Comment Edited] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276501#comment-17276501 ] Pac A. He edited comment on ARROW-11456 at 2/1/21, 5:21 PM:

[jira] [Comment Edited] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276501#comment-17276501 ] Pac A. He edited comment on ARROW-11456 at 2/1/21, 5:21 PM:

[jira] [Commented] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276536#comment-17276536 ] Pac A. He commented on ARROW-11456: --- [~apitrou] Yes, absolutely. I had used pandas 1.2

[jira] [Updated] (ARROW-11464) [Python] pyarrow.parquet.read_pandas doesn't conform to its docs

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11464: -- Description: The {{*pyarrow.parquet.read_pandas*}} [implementation|https://github.com/apache/arrow/bl

[jira] [Created] (ARROW-11464) [Python] pyarrow.parquet.read_pandas doesn't conform to its docs

2021-02-01 Thread Pac A. He (Jira)
Pac A. He created ARROW-11464: - Summary: [Python] pyarrow.parquet.read_pandas doesn't conform to its docs Key: ARROW-11464 URL: https://issues.apache.org/jira/browse/ARROW-11464 Project: Apache Arrow

[jira] [Updated] (ARROW-11464) [Python] pyarrow.parquet.read_pandas doesn't conform to its docs

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11464: -- Description: The {{*pyarrow.parquet.read_pandas*}} [implementation|https://github.com/apache/arrow/bl

[jira] [Updated] (ARROW-11464) [Python] pyarrow.parquet.read_pandas doesn't conform to its docs

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11464: -- Description: The {{*pyarrow.parquet.read_pandas*}} [implementation|https://github.com/apache/arrow/bl

[jira] [Updated] (ARROW-11464) [Python] pyarrow.parquet.read_pandas doesn't conform to its docs

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11464: -- Description: The {{*pyarrow.parquet.read_pandas*}} [implementation|https://github.com/apache/arrow/bl

[jira] [Updated] (ARROW-11464) [Python] pyarrow.parquet.read_pandas doesn't conform to its docs

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11464: -- Description: The {{*pyarrow.parquet.read_pandas*}} [implementation|https://github.com/apache/arrow/bl

[jira] [Updated] (ARROW-11464) [Python] pyarrow.parquet.read_pandas doesn't conform to its docs

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11464: -- Description: The {{*pyarrow.parquet.read_pandas*}} [implementation|https://github.com/apache/arrow/bl

[jira] [Updated] (ARROW-11464) [Python] pyarrow.parquet.read_pandas doesn't conform to its docs

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11464: -- Description: The {{*pyarrow.parquet.read_pandas*}} [implementation|https://github.com/apache/arrow/bl

[jira] [Commented] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-02 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17277234#comment-17277234 ] Pac A. He commented on ARROW-11456: --- For what it's worth, {{fastparquet}} v0.5.0 had n

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-02 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.re

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-02 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.re

[jira] [Comment Edited] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-02 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17277234#comment-17277234 ] Pac A. He edited comment on ARROW-11456 at 2/2/21, 4:12 PM:

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-02 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.re

[jira] [Commented] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-04 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278976#comment-17278976 ] Pac A. He commented on ARROW-11456: --- Unfortunately I have not been able to produce a r

[jira] [Comment Edited] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-04 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278976#comment-17278976 ] Pac A. He edited comment on ARROW-11456 at 2/4/21, 5:08 PM:

[jira] [Comment Edited] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-04 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278976#comment-17278976 ] Pac A. He edited comment on ARROW-11456 at 2/4/21, 5:09 PM:

[jira] [Comment Edited] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-04 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278976#comment-17278976 ] Pac A. He edited comment on ARROW-11456 at 2/4/21, 5:09 PM:

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.re

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.read

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.read

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.read

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.read

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error: {noformat} df: Final = pd.read

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Environment: pyarrow 3.0.0 / 2.0.0 pandas 1.1.5 / 1.2.1 smart_open 4.1.2 python 3.8.6 was: pyarrow

[jira] [Commented] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279918#comment-17279918 ] Pac A. He commented on ARROW-11456: --- I see. I have now added code to reproduce the iss

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading or writing a large parquet file, I have this error: {noformat} df: Fina

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading or writing a large parquet file, I have this error: {noformat} df: Fina

[jira] [Comment Edited] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279918#comment-17279918 ] Pac A. He edited comment on ARROW-11456 at 2/5/21, 7:01 PM:

[jira] [Comment Edited] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-09 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281869#comment-17281869 ] Pac A. He edited comment on ARROW-11456 at 2/9/21, 4:22 PM:

[jira] [Commented] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-09 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281869#comment-17281869 ] Pac A. He commented on ARROW-11456: --- We have seen that there are one or more pyarrow l

[jira] [Comment Edited] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-09 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281869#comment-17281869 ] Pac A. He edited comment on ARROW-11456 at 2/9/21, 4:22 PM:

[jira] [Created] (ARROW-10152) "ImportError: liborc.so" with miniconda pyarrow=1.0.1 when "import pyarrow"

2020-10-01 Thread Pac A. He (Jira)
Pac A. He created ARROW-10152: - Summary: "ImportError: liborc.so" with miniconda pyarrow=1.0.1 when "import pyarrow" Key: ARROW-10152 URL: https://issues.apache.org/jira/browse/ARROW-10152 Project: Apache

[jira] [Updated] (ARROW-10152) "ImportError: liborc.so" with miniconda pyarrow=1.0.1 when "import pyarrow"

2020-10-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-10152: -- Description: I cannot run "{{import pyarrow}}" with {{pyarrow=1.0.1}} in dockerized miniconda. It wor

[jira] [Commented] (ARROW-10152) "ImportError: liborc.so" with miniconda pyarrow=1.0.1 when "import pyarrow"

2020-10-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208154#comment-17208154 ] Pac A. He commented on ARROW-10152: --- There is nothing wrong with `environment.yml`. Th

[jira] [Comment Edited] (ARROW-10152) "ImportError: liborc.so" with miniconda pyarrow=1.0.1 when "import pyarrow"

2020-10-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208154#comment-17208154 ] Pac A. He edited comment on ARROW-10152 at 10/5/20, 4:03 PM: -

[jira] [Closed] (ARROW-10152) "ImportError: liborc.so" with miniconda pyarrow=1.0.1 when "import pyarrow"

2020-10-05 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He closed ARROW-10152. - > "ImportError: liborc.so" with miniconda pyarrow=1.0.1 when "import pyarrow" >