[ https://issues.apache.org/jira/browse/ARROW-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285196#comment-17285196 ]
Antoine Pitrou commented on ARROW-11381:
----------------------------------------

No need to hurry on this, as the LZ4 format in Parquet is unfortunately unspecified and the C++ Parquet implementation is also running into trouble trying to be compatible with the reference Java implementation (named "parquet-mr").

> [Rust] [Parquet] LZ4 compressed files written in Rust can't be opened with C++
> ------------------------------------------------------------------------------
>
> Key: ARROW-11381
> URL: https://issues.apache.org/jira/browse/ARROW-11381
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Neville Dipale
> Priority: Major
>
> Parquet files that are written with LZ4 compression cannot be read from
> pyarrow. It seems that the issue might be the LZ4 block vs frame format, which
> we're also seeing in ARROW-8767.
> I'll update this JIRA with more info, as I'm struggling to get pyspark up on
> macOS (Rosetta 2 issues)
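
For readers unfamiliar with the block vs frame distinction mentioned in the description: an LZ4 frame starts with the 4-byte magic number 0x184D2204 (stored little-endian), while a raw LZ4 block has no header at all. The following is only a minimal sketch of how one might probe a compressed buffer for that magic. The helper name looks_like_lz4_frame is hypothetical, not part of Arrow or of either Parquet implementation, and a raw block could in principle begin with the same bytes by coincidence, so a positive result is a hint rather than proof.

```python
import struct

# Magic number defined by the LZ4 frame format; written little-endian on disk.
LZ4_FRAME_MAGIC = 0x184D2204


def looks_like_lz4_frame(buf: bytes) -> bool:
    """Return True if buf begins with the LZ4 frame magic number.

    Hypothetical helper for illustration only. A raw LZ4 block carries no
    header, so this check cannot positively identify block-format data.
    """
    if len(buf) < 4:
        return False
    (magic,) = struct.unpack_from("<I", buf, 0)
    return magic == LZ4_FRAME_MAGIC
```

Applied to the compressed bytes of a Parquet page (extracting those bytes is outside the scope of this sketch), a check like this could help tell whether a writer emitted frame-format data where the reader expects a raw block, or the other way around.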