[ 
https://issues.apache.org/jira/browse/ARROW-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285196#comment-17285196
 ] 

Antoine Pitrou commented on ARROW-11381:
----------------------------------------

No need to hurry on this: the LZ4 format in Parquet is unfortunately 
unspecified, and the C++ Parquet implementation is itself running into trouble 
trying to be compatible with the reference Java implementation (named "parquet-mr").

> [Rust] [Parquet] LZ4 compressed files written in Rust can't be opened with C++
> ------------------------------------------------------------------------------
>
>                 Key: ARROW-11381
>                 URL: https://issues.apache.org/jira/browse/ARROW-11381
>             Project: Apache Arrow
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Neville Dipale
>            Priority: Major
>
> Parquet files written with LZ4 compression in Rust cannot be read from 
> pyarrow. It seems that the issue might be the LZ4 block vs. frame format, 
> which we're also seeing in ARROW-8767.
> I'll update this JIRA with more info, as I'm struggling to get pyspark up on 
> macOS (Rosetta 2 issues).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
