We should extend the archery IPC integration tests to cover this (ideally
without checking in any files).
On Thursday, January 28, 2021, Fan Liya wrote:
> Hi Joris,
>
> The Java support for lz4 compression is on-going (
> https://github.com/apache/arrow/pull/8949).
> Integration with C++/Python is not fin
Hi Joris,
The Java support for lz4 compression is on-going (
https://github.com/apache/arrow/pull/8949).
Integration with C++/Python is not finished yet.
We would appreciate it if you could share the file to help us with the
integration test.
Best,
Liya Fan
On Fri, Jan 29, 2021 at 2:41 AM Antoi
Le 28/01/2021 à 19:38, Wes McKinney a écrit :
> It still seems notable that our generic LZ4-compressed output stream
> cannot be read by Java (independent of Arrow and the Arrow IPC
> format).
That and the custom LZ4 framing used by Parquet-Java... Apparently the
Java ecosystem can't implement p
It still seems notable that our generic LZ4-compressed output stream
cannot be read by Java (independent of Arrow and the Arrow IPC
format).
On Thu, Jan 28, 2021 at 12:30 PM Antoine Pitrou wrote:
>
> On Thu, 28 Jan 2021 18:19:00 +
> Joris Peeters wrote:
>
> > To be fair, I'm happy to apply i
On Thu, 28 Jan 2021 18:19:00 +
Joris Peeters wrote:
> To be fair, I'm happy to apply it at IPC level. Just didn't realise that
> was a thing. IIUC what Antoine suggests, though, then just (leaving Python
> as-is and) changing my Java to
>
> var is = new FileInputStream(path.toFile());
>
Aha, OK!
Thanks for the help all. I'll keep an eye on the Java side for the IPC
compression, but for my current purpose doing full stream compression is
totally fine.
On Thu, Jan 28, 2021 at 6:22 PM Micah Kornfield
wrote:
> Java support for application-level compression is being
Java support for application-level compression is being worked on (I would
need to double-check whether the PR has been merged), and I don't think it's
been integration tested with C++/Python. I would imagine it would run into a
similar issue with not being able to decode linked blocks.
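One way to check whether a given file actually uses linked blocks is to look at the frame header. The following is a minimal sketch, assuming the file was written in the standard LZ4 frame format (magic number 0x184D2204 followed by the FLG and BD bytes). The function name describe_lz4_frame_header is hypothetical and is not part of pyarrow or lz4-java.

```python
import struct

# Magic number of the LZ4 frame format, stored little-endian in the file.
LZ4_FRAME_MAGIC = 0x184D2204


def describe_lz4_frame_header(header: bytes) -> dict:
    """Decode the FLG and BD bytes at the start of an LZ4 frame.

    `header` must hold at least the first 6 bytes of the stream:
    4 bytes of magic number, then FLG, then BD.
    """
    if len(header) < 6:
        raise ValueError("need at least 6 bytes of the LZ4 frame header")
    (magic,) = struct.unpack_from("<I", header, 0)
    if magic != LZ4_FRAME_MAGIC:
        raise ValueError("not an LZ4 frame (magic number mismatch)")
    flg = header[4]
    bd = header[5]
    return {
        "version": (flg >> 6) & 0b11,            # bits 7-6, expected to be 1
        "block_independent": bool(flg & 0x20),   # bit 5: False means linked blocks
        "block_checksum": bool(flg & 0x10),      # bit 4
        "content_size": bool(flg & 0x08),        # bit 3
        "content_checksum": bool(flg & 0x04),    # bit 2
        "dict_id": bool(flg & 0x01),             # bit 0
        "block_max_size_code": (bd >> 4) & 0b111,  # bits 6-4 of BD
    }
```

To use it, read the first few bytes of the compressed file in binary mode and pass them in, for example describe_lz4_frame_header(open(path, "rb").read(7)). If block_independent comes back False, the blocks are linked, which is the case suspected above of tripping up the Java reader.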
To be fair, I'm happy to apply it at IPC level. Just didn't realise that
was a thing. IIUC what Antoine suggests, though, then just (leaving Python
as-is and) changing my Java to
var is = new FileInputStream(path.toFile());
var reader = new ArrowStreamReader(is, allocator);
var schema
It might be worth opening an issue with the lz4-java library. It seems
like the Java implementation doesn't fully support the LZ4 stream
protocol?
Antoine, in this case it looks like Joris is applying the compression and
decompression at the file level, NOT the IPC level.
On Thu, Jan 28, 2021
Le 28/01/2021 à 17:59, Joris Peeters a écrit :
> From Python, I'm dumping an LZ4-compressed arrow stream to a file, using
>
> with pa.output_stream(path, compression = 'lz4') as fh:
>     writer = pa.RecordBatchStreamWriter(fh, table.schema)
>     writer.write_table(table)
>
hi Joris -- this isn't a use case that we intend for most users (we
intend for users to instead use the LZ4 compression option that is
part of the IPC format itself, rather than something that is layered
on externally), but it would be good to make sure that our LZ4 streams
are interoperable across
From Python, I'm dumping an LZ4-compressed arrow stream to a file, using
with pa.output_stream(path, compression = 'lz4') as fh:
    writer = pa.RecordBatchStreamWriter(fh, table.schema)
    writer.write_table(table)
    writer.close()
I then try reading this file from Java, star