[ 
https://issues.apache.org/jira/browse/ARROW-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281975#comment-17281975
 ] 

Gidon Gershinsky commented on ARROW-8040:
-----------------------------------------

Hi Itamar, to sum up my previous comments on this, there are two issues with 
exposing the low-level encryption API to users:
 - security: the low-level encryption API is easy to misuse (e.g. using the 
same keys for a number of different files, which would break the AES GCM 
cipher). The high-level encryption layer handles that by applying envelope 
encryption and other data security best practices. Also, this layer is 
maintained by the community, so future improvements and security fixes can be 
upstreamed by anyone and become available to all.
 - compatibility: parquet-mr implements the high-level encryption layer. If we 
want files produced by Spark/Presto/etc. to be readable by pandas/PyArrow 
(and vice versa), we need to provide Arrow users with the high-level API.
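
To make the security point a bit more concrete, here is a rough sketch of the 
two ideas: why reusing one key (and nonce) across files is dangerous, and how 
envelope encryption avoids it by giving each file its own random data key. 
This is only an illustration under loud assumptions: the cipher is a toy 
SHA-256 counter-mode keystream standing in for the keystream part of AES GCM, 
it has no authentication, and the names (toy_encrypt, envelope_encrypt, etc.) 
are hypothetical. It is not the Parquet or Arrow API and must not be used to 
protect real data.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream built from SHA-256 (stand-in, NOT AES GCM)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; decryption is the same operation."""
    return xor(data, keystream(key, nonce, len(data)))


toy_decrypt = toy_encrypt


def envelope_encrypt(master_key: bytes, data: bytes) -> dict:
    """Encrypt data with a fresh random data key, then wrap that key."""
    data_key = secrets.token_bytes(16)    # unique per file
    data_nonce = secrets.token_bytes(12)
    wrap_nonce = secrets.token_bytes(12)
    return {
        "wrapped_key": toy_encrypt(master_key, wrap_nonce, data_key),
        "wrap_nonce": wrap_nonce,
        "data_nonce": data_nonce,
        "ciphertext": toy_encrypt(data_key, data_nonce, data),
    }


def envelope_decrypt(master_key: bytes, blob: dict) -> bytes:
    """Unwrap the per-file data key with the master key, then decrypt."""
    data_key = toy_decrypt(master_key, blob["wrap_nonce"], blob["wrapped_key"])
    return toy_decrypt(data_key, blob["data_nonce"], blob["ciphertext"])


# Misuse: the same key and nonce for two different files.
shared_key = secrets.token_bytes(16)
shared_nonce = bytes(12)
p1 = b"file one: salary=100"
p2 = b"file two: salary=999"
c1 = toy_encrypt(shared_key, shared_nonce, p1)
c2 = toy_encrypt(shared_key, shared_nonce, p2)

# The keystream cancels out, so an attacker learns p1 XOR p2 without the key.
leak = xor(c1, c2)
```

With the envelope functions, each call to envelope_encrypt draws a new data 
key, so two files never share a key even though the user only manages a 
single master key.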

Actually, this layer is already implemented in C++ and has been submitted as 
a PR to the Arrow repo (https://github.com/apache/arrow/pull/8023). I agree 
with you that it has been a long time coming, and I wonder when the community 
can resume the review.

> [Python][Packaging] Add Parquet encryption / OpenSSL to Python wheels
> ---------------------------------------------------------------------
>
>                 Key: ARROW-8040
>                 URL: https://issues.apache.org/jira/browse/ARROW-8040
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Wes McKinney
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
