Performance of ArrowJS in the DOM

2020-07-02 Thread Matthias Vallentin
Hi folks, We are reaching out to better understand the performance of ArrowJS when it comes to viewing large amounts of data (> 1M records) in the browser’s DOM. Our backend (https://github.com/tenzir/vast) spits out record batches, which we are accumulating in the frontend with a RecordBatchReade

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-08-18 Thread Matthias Vallentin
We are very interested in Plasma as a stand-alone project. The fork would hit us doubly hard, because it reduces both the appeal of an Arrow-specific use case and our planned Ray integration. We are effectively developing a database for network activity data that runs with Arrow as data pla

Re: Rust/Datafusion sort kernel issues

2020-09-02 Thread Matthias Vallentin
Would it perhaps make sense to define the total order for non-numbers (NaN, Inf, -Inf) globally (i.e., in the spec or in Arrow itself) so that the behavior is the same across all languages? On Fri, Aug 28, 2020 at 7:42 PM Andy Grove wrote: > Hi Jörn, > > I agree with your concerns about NaN. The
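To make the idea of a single, language-independent ordering concrete, here is a minimal Python sketch of one possible total order over floating-point values. The convention used (-Inf first, then finite values, then Inf, with NaN placed last) and the helper name `total_order_key` are assumptions chosen for illustration; the thread itself does not settle on a specific order.

```python
import math

def total_order_key(x):
    """Hypothetical sort key: every NaN sorts after all other values."""
    if math.isnan(x):
        # All NaNs share one key, so they compare equal to each other.
        return (1, 0.0)
    # -inf, finite values, and +inf already compare consistently.
    return (0, x)

values = [float("nan"), 1.0, float("-inf"), float("inf"), -2.0]
ordered = sorted(values, key=total_order_key)
print(ordered)
```

Mapping NaN to a fixed key avoids comparing NaN directly, which is what makes a plain sort on such data unreliable.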

Re: Floating-point order

2020-09-02 Thread Matthias Vallentin
want a NaN in the input to > force the result to NaN (as the IEEE spec would say), or you may want > NaNs to be ignored. NumPy has two functions (`sum` and `nansum`) for > these two different behaviours. > > Regards > > Antoine. > > > Le 02/09/2020 à 11:40, Matthias Vall
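The two behaviours mentioned in the quoted reply (NaN forcing the result to NaN versus NaN being ignored) can be sketched in plain Python. This is only an illustration of the distinction that the message attributes to NumPy's `sum` and `nansum`; the function names `strict_sum` and `nan_ignoring_sum` are hypothetical and not part of any library.

```python
import math

def strict_sum(values):
    """Propagating behaviour: any NaN in the input makes the result NaN."""
    return sum(values)

def nan_ignoring_sum(values):
    """Ignoring behaviour: NaN entries are skipped before summing."""
    return sum(v for v in values if not math.isnan(v))

data = [1.0, float("nan"), 2.5]
print(strict_sum(data))        # nan
print(nan_ignoring_sum(data))  # 3.5
```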

Re: [DISCUSS] Rethinking our approach to scheduling CPU and IO work in C++?

2020-09-22 Thread Matthias Vallentin
We are building a highly concurrent database for security data with Arrow as data plane (VAST), so I thought I'd share our view on this since we went over pretty much all of the above-mentioned questions. I'm not trying to say "you should do it this way" but instea

General questions about Arrow & Plasma

2017-11-16 Thread Matthias Vallentin
Two questions about Plasma; my use case is sharing Arrow data between a C++ and Python application (eventually also R). 1. What's the typical memory allocation procedure when using Plasma and Arrow? Do I first construct a builder, populate it, finish it, and *then* copy it into mmapped buffe

Re: General questions about Arrow & Plasma

2017-11-18 Thread Matthias Vallentin
without copies and our Python serialization will also be able to take limited advantage of this. -- Philipp On Thu, Nov 16, 2017 at 7:30 AM, Matthias Vallentin wrote: Two questions about Plasma; my use case is sharing Arrow data between a C++ and Python application (eventually also R). 1. What

Re: General questions about Arrow & Plasma

2017-11-25 Thread Matthias Vallentin
Here are some more examples on how to interact between Plasma and Arrow: http://arrow.apache.org/docs/python/plasma.html, see also the C++ documentation: http://arrow.apache.org/docs/cpp/md_tutorials_plasma.html I'm browsing through the C++ API documentation and have trouble finding the right AP

[jira] [Created] (ARROW-4319) plasma/store.h pulls ins flatbuffer dependency

2019-01-22 Thread Matthias Vallentin (JIRA)
Matthias Vallentin created ARROW-4319:
Summary: plasma/store.h pulls ins flatbuffer dependency
Key: ARROW-4319
URL: https://issues.apache.org/jira/browse/ARROW-4319
Project: Apache Arrow