Hi Bob,
Thanks for reporting the issues.
I remember encountering the same problems with the JDBC tests (over one
year ago).
Maybe it is related not just to the time zone but also to the
machine locale.
I think we can open an issue to track it.
Best,
Liya Fan
On Fri, Mar 12, 2021 at
Besides reporting errors, maybe a kernel wants to allocate memory through
KernelContext::memory_pool [1] in Kernel::init?
I'm not quite sure if this is a valid case. Would like to hear other comments.
[1]
https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernel.h#L95
Yibo
On 3/
Hi Bob,
Thanks for the feedback. I don't think a lot of people are developing on
Windows. Some answers inline:
* Build does require Java 8, not "8 or later" as stated in java/README.md
> There's a reference to sun.misc.Unsafe
> in
> memory/memory-core/src/main/java/org/apache/arrow/memory/ut
Great, thanks for the responses! That all makes sense :)
On Thu, Mar 11, 2021 at 1:29 PM Benjamin Kietzman
wrote:
> Hi Aldrin,
>
> We don't have a unified repository for design docs that I'm aware of.
> Governance-wise only JIRA and the mailing lists are canonical, but
> IIUC it'd be legal, stra
My mail client took out all the linefeeds, so let me reformat; sorry about that!
In the process of slogging through the build, I've bumped into various issues.
I'm happy to document them in java/README.md or make any other changes that
might be helpful to others.
I'm pretty experienced with J
I've been mostly lurking for a while, but I would like to start picking off some
bugs in the Java implementation. In the process of slogging through the build,
I've bumped into various issues. I'm happy to document them in java/README.md
or make any other changes that might be helpful to others. I
Hi Aldrin,
We don't have a unified repository for design docs that I'm aware of.
Governance-wise only JIRA and the mailing lists are canonical, but
IIUC it'd be legal, straightforward, and beneficial to provide a directory
like the one you describe of "design docs proposed to the ML" or so.
Ben K
KernelContext is a tuple consisting of pointers to an ExecContext and a
KernelState, and an error Status. The context's error Status may be set by
compute kernels (for example when divide-by-zero would occur) rather than
returning a Result as in the rest of the codebase. IIUC the intent is to avoid
Hi Jack,
Thanks for the input, and there are some interesting ideas there.
If we were looking to break this into separate donations though I would
actually consider 2+3 to be the first piece to incorporate into DataFusion
because it would provide much better scalability compared to the current
mo
FYI, I opened up https://github.com/lz4/lz4-java/issues/176 to discuss
support for dependent frames.
On Thu, Mar 11, 2021 at 11:59 AM David Li wrote:
> At least for Flight, I don't think we'd use that. Right now the way
> compression is supported is the same way as with Feather, i.e. the body
>
This is a new document that we just started earlier this week. I'd put
together some docs in the past to try to bootstrap community
organization on this, but since we're now finally putting hands to
code after setting up some critical dependencies (like the Datasets
interface, which is needed to im
At least for Flight, I don't think we'd use that. Right now the way compression
is supported is the same way as with Feather, i.e. the body buffers in each
individual record batch sent on the wire are compressed, but not the stream as
a whole. (And so far we haven't found a compelling benefit fo
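
To illustrate the difference between compressing each record batch body on its own and compressing the stream as a whole, here is a minimal sketch in plain Python. It is not Flight or Arrow IPC code: zlib from the standard library stands in for LZ4, and the function names and sample payloads are made up for the example.

```python
import zlib


def compress_batches(bodies):
    # Per-batch: every body is compressed independently, so any single
    # batch can be decompressed without touching the others.
    return [zlib.compress(body) for body in bodies]


def compress_stream(bodies):
    # Whole-stream: one compressor spans all bodies, so later output
    # depends on earlier input and must be decoded from the start.
    comp = zlib.compressobj()
    out = b"".join(comp.compress(body) for body in bodies)
    return out + comp.flush()


# Hypothetical payloads standing in for record batch body buffers.
bodies = [b"batch-one " * 50, b"batch-two " * 50, b"batch-three " * 50]

per_batch = compress_batches(bodies)
whole = compress_stream(bodies)

# The second batch decodes on its own in the per-batch layout.
second = zlib.decompress(per_batch[1])
# The whole-stream layout only round-trips as a single unit.
roundtrip = zlib.decompress(whole)
```

The per-batch layout keeps each message self-contained, at the cost of not sharing compression context across batches.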
Hi Ben, thanks for the link!
I will eventually be interested in this direction as well, but hadn't seen
this document. Is there a place where these design documents can be found?
I've seen this and a few other google doc links floating around the mailing
list but I can't figure out how to navigate
Hey Andy
I want to discuss the areas of Ballista code that you proposed above to
move to Arrow. These are:
1. serde code for translating between protobuf and
Arrow/DataFusion/Ballista data structures
2. Distributed query planner
3. Scheduler process that coordinates query execution across availabl
On 11/03/2021 at 19:54, Micah Kornfield wrote:
Indeed, I don't think it was discussed publicly. The LZ4 frame format
has several things going for it:
- it allows streaming compression and decompression (meaning you can
avoid loading a huge compressed buffer at once)
Is this something we m
>
> Indeed, I don't think it was discussed publicly. The LZ4 frame format
> has several things going for it:
> - it allows streaming compression and decompression (meaning you can
> avoid loading a huge compressed buffer at once)
Is this something we make use of or intend to make use of?
> - it
"Is https://github.com/lz4/lz4-java the fast Java lz4 library in
question? The incompleteness of this implementation is a known problem
for other user communities, not only Arrow. It would be a great public
service to improve it so that it fully implements the lz4 frame
specification."
Very much +
I prefer the lz4 frame format for the reasons that Antoine stated.
To be friendly to users, the Arrow IPC documentation could mention
that lz4 compression may break Java interoperability. If block
dependency is the only obstacle to Java interoperability, the Arrow
IPC implementation could disable
Hi,
This is not yet implemented but it is on the roadmap for the near future:
https://docs.google.com/document/d/1AyTdLU-RxA-Gsb9EsYnrQrmqPMOYMfPlWwxRi1Is1tQ
Ben Kietzman
On Thu, Mar 11, 2021 at 12:33 PM Kirill Lykov
wrote:
> Hi,
>
> Is it possible somehow using existing compute functionality
Thanks, Micah.
Regarding integration testing, we currently have an integration test script
in the repo that spins up multiple processes in docker compose and runs
through a series of queries on a data set that can be generated locally. I
invested in some modest hardware (a refurbed 12 core prolian
On 11/03/2021 at 17:58, Micah Kornfield wrote:
We've found in the process of implementing support for LZ4 decompression
that the fast Java decoder library does not support all the features of the
C++ library (dependent blocks can't be read, and by default that is what
the C++ code emits).
Hi,
Is it possible somehow using existing compute functionality or some other
code to join two tables by values in a common column?
--
Best regards,
Kirill Lykov
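
For readers unfamiliar with the operation being asked about, here is a minimal sketch of an inner join on a common column, written in plain Python over lists of row dicts. It does not use any Arrow compute API; `hash_join` and the sample tables are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Inner-join two lists of row dicts on the column named `key`."""
    # Build phase: index the right table by its key value.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    # Probe phase: look up each left row and merge it with every match.
    joined = []
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            merged = dict(lrow)
            merged.update(rrow)
            joined.append(merged)
    return joined


# Hypothetical sample tables sharing the "id" column.
orders = [{"id": 1, "item": "a"}, {"id": 2, "item": "b"}, {"id": 3, "item": "c"}]
prices = [{"id": 1, "price": 10}, {"id": 3, "price": 30}]

result = hash_join(orders, prices, "id")
```

Rows whose key has no match in the other table (here `id` 2) are dropped, which is what makes this an inner join.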
What about the JNI bindings for lz4-c?
On 11/03/2021 at 18:20, Micah Kornfield wrote:
I looked a little closer and it looks like it only supports Block format
(in the code I couldn't find any references to Frame).
On Thu, Mar 11, 2021 at 9:16 AM Antoine Pitrou wrote:
Have you tr
I looked a little closer and it looks like it only supports Block format
(in the code I couldn't find any references to Frame).
On Thu, Mar 11, 2021 at 9:16 AM Antoine Pitrou wrote:
>
> Have you tried another Java LZ4 library (I think you mentioned Airlift
> on a PR)?
>
>
> On 11/03/2021
Have you tried another Java LZ4 library (I think you mentioned Airlift
on a PR)?
On 11/03/2021 at 17:58, Micah Kornfield wrote:
We've found in the process of implementing support for LZ4 decompression
that the fast Java decoder library does not support all the features of the
C++ library
We've found in the process of implementing support for LZ4 decompression
that the fast Java decoder library does not support all the features of the
C++ library (dependent blocks can't be read, and by default that is what
the C++ code emits). The only library we found (Apache Commons) that seem
I think having Ballista in Arrow sounds like a good idea in the short
term. It sounds like there is enough developer pain that bringing it here
makes sense (providing existing Ballista contributors are happy with the
change and current Rust maintainers are open to the work involved).
One longer
Now that we have the ability to vote on source releases for patch releases,
with each implementation having more freedom to release outside of the
major release process, we need to document how to do this for the Rust
implementation (and this is probably of interest to other implementations
as well
Arrow Build Report for Job nightly-2021-03-11-0
All tasks:
https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-03-11-0
Failed Tasks:
- conda-linux-gcc-py37-cpu-r40:
URL:
https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-03-11-0-azure-conda-linux