Re: Exposing low-level Parquet encryption to Python user (or, maybe not)

2021-02-16 Thread Itamar Turner-Trauring
On Mon, Feb 15, 2021, at 2:49 PM, Micah Kornfield wrote:
> Sorry I realized I had a typo in my email. We should definitely namespace
> dangerous apis appropriately.
Decryption doesn't seem necessarily dangerous? In any case, I will start with PR for decryption only and we can see how that goes

Re: Exposing low-level Parquet encryption to Python user (or, maybe not)

2021-02-15 Thread Itamar Turner-Trauring
On Fri, Feb 12, 2021, at 11:52 PM, Micah Kornfield wrote:
> 2. I'm open to exposing the lower level encryption libraries in python
> (without appropriate namespacing/communication). It seems at least for
> reading, there is potentially less harm (I'll caveat that with I'm not a
> security exper

Exposing low-level Parquet encryption to Python user (or, maybe not)

2021-02-10 Thread Itamar Turner-Trauring
Hi, Since the PR for high-level C++ Parquet encryption API appears stalled (https://github.com/apache/arrow/pull/8023), I'm looking into exposing the low-level Parquet encryption API to Python. Arguments for doing this: the low-level API is all the users I'm talking to need, at the moment, so

Re: Adding Parquet encryption support to PyArrow

2020-09-08 Thread Itamar Turner-Trauring
Still learning from the discussion/docs, but in the meantime I created https://issues.apache.org/jira/projects/ARROW/issues/ARROW-9947 which has link to a

Re: Adding Parquet encryption support to PyArrow

2020-09-06 Thread Itamar Turner-Trauring
> > > >> API and keytools
> > > >> separately. A user would create and initialize a
> > > >> PropertiesDrivenCryptoFactory and use it to create the
> > > >> FileEncryptionProperties/FileDecryptionProperties to pass to the lower
> > > >> level API. In pandas this would

Re: Adding Parquet encryption support to PyArrow

2020-09-03 Thread Itamar Turner-Trauring
On Thu, Sep 3, 2020, at 11:01 AM, Antoine Pitrou wrote:
>
> Hi Gidon,
>
> On 03/09/2020 at 16:53, Gidon Gershinsky wrote:
> > Hi Itamar,
> >
> > My suggestion would be to wrap a different API in Python - the high-level
> > encryption interface of
> > https://github.com/apache/arrow/pull/8023
>

Adding Parquet encryption support to PyArrow

2020-09-03 Thread Itamar Turner-Trauring
Hi, I'm looking into implementing this, and it seems like there are two parts: packaging, but also wrapping the APIs in Python. Is the latter item accurate? If so, any examples of similar existing wrapped APIs, or should I just come up with something on my own? Context: https://github.com/apac

[jira] [Created] (ARROW-6045) Benchmark for Parquet float and NaN encoding/decoding

2019-07-26 Thread Itamar Turner-Trauring (JIRA)
Itamar Turner-Trauring created ARROW-6045:
Summary: Benchmark for Parquet float and NaN encoding/decoding
Key: ARROW-6045
URL: https://issues.apache.org/jira/browse/ARROW-6045
Project: Apache