I think it would be worthwhile to split the discussion into two separate
threads. One thread for compression & encodings (which are related or
even the same topic), one thread for data integrity.
Regards
Antoine.
On 08/07/2019 at 07:22, Micah Kornfield wrote:
>
> - Compression:
>* Us
Hi Jacques,
> That's quite interesting. Can you share more about the use case?
Sorry, I realized I missed answering this. We are still investigating, so
the initial diagnosis might be off. The use-case is a data transfer
application, reading data at rest, translating it to Arrow and sending it
o
Hi Micah,
Thanks for opening this discussion.
Similar to Liya Fan, I generally agree with you on most features. As you
mentioned above, we have made some attempts in our application to reduce data
size, for example, data encoding and RecordBatch compact[1], and it has
significant performance be
Hi Micah,
Thanks for opening this discussion.
For me, most of the features are super useful, especially RLE and integer
encoding.
IMO, to support these new features, we need some basic algorithms first
(e.g. sort and search).
For example, RLE and sort are often used in combination.
These new fea
Hi Paul, Jacques and Antoine,
Thank you for the valuable feedback. I'm going to try to address it all in
this e-mail to help consolidate the conversation. I've grouped my
responses by topic and included snippets from other e-mails where relevant.
*Timeline of any features: *
- So far the sentim
>
>> What is the driving force for transport compression? Are you seeing that
>> as a major bottleneck in particular circumstances? (I'm not disagreeing,
>> just want to clearly define the particular problem you're worried about.)
>
>
> I've been working on a 20% project where we appear to be IO bou
Hi Micah,
On 05/07/2019 at 20:53, Micah Kornfield wrote:
>
> Going into more details on the specific features in the PR:
>
> 1. Sparse encodings for arrays and buffers. The guiding principles behind
>    the suggested encodings are to support encodings that can be exploited by
>
Hi Micah,
Similar to Jacques, I'm not disagreeing, but wondering if they belong in
Arrow vs. can be done externally. I'm mostly interested in changes that
might impact SIMD processing, considering Arrow's already made conscious
design decisions to trade memory for speed. Apologies in advance if
Hi Jacques,
I think our e-mails might have crossed, so I'm consolidating my responses
from the previous e-mail as well.
> I don't think most of this should be targeted for 1.0. It is a lot of
> change/enhancement and seems like it would likely substantially delay 1.0.
I agree it shouldn't block 1.
One question and a random thought:
What is the driving force for transport compression? Are you seeing that as
a major bottleneck in particular circumstances? (I'm not disagreeing, just
want to clearly define the particular problem you're worried about.)
Random thought: what do you think of defin
Hi Jacques,
Thanks for the quick response.
> I don't think most of this should be targeted for 1.0. It is a lot of
> change/enhancement and seems like it would likely substantially delay 1.0.
I agree it shouldn't block 1.0. I think time-based releases are working
well for the community. But if
Strange, I've pasted the contents into a Google document at [1]
[1]
https://docs.google.com/document/d/1uJzWh63Iqk7FRbElHPhHrsmlfe0NIJ6M8-0kejPmwIw/edit
On Fri, Jul 5, 2019 at 12:32 PM Jacques Nadeau wrote:
> Hey Micah, your formatting seems to be messed up on this mail. Some kind
> of copy/
Initial thought: I don't think most of this should be targeted for 1.0. It
is a lot of change/enhancement and seems like it would likely substantially
delay 1.0. The one piece that seems least disruptive would be basic on-the-
wire compression. You suggested that this be done on the buffer level but
Hey Micah, your formatting seems to be messed up on this mail. Some kind
of copy/paste error?
On Fri, Jul 5, 2019 at 11:54 AM Micah Kornfield
wrote:
> Hi Arrow-dev,
>
> I’d like to make a straw-man proposal to cover some features that I think
> would be useful to Arrow, and that I would like t