I have spent some time going through the JIRA backlog and have
organized an umbrella JIRA with about 75 issues under it to help
coordinate building out further compute kernels and kernel execution
functionality:
https://issues.apache.org/jira/browse/ARROW-8894
On Sun, May 24, 2020 at 9:36 AM Wes McKinney wrote:
I have merged the patch but left the PR open for additional code review.
On Sat, May 23, 2020 at 3:24 PM Wes McKinney wrote:
To be clear, given the scope of code affected, I think we should merge it
today and address further feedback in a follow-up patch. I will be diligent
about responding to additional comments in the PR.
On Sat, May 23, 2020, 3:19 PM Wes McKinney wrote:
Yes, you should still be able to comment. I will reopen the PR after it is
merged.
On Sat, May 23, 2020, 2:52 PM Micah Kornfield wrote:
Hi Wes,
Will we still be able to comment on the PR once it is closed?
If we want to be inclusive on feedback, it might pay to wait until Tuesday
evening US time to merge, since it is a long weekend here.
Thanks,
Micah
On Saturday, May 23, 2020, Wes McKinney wrote:
Hi folks -- I've addressed a good deal of feedback and added a lot of
comments, and with Kou's help have got the build passing. It would be
great if this could be merged soon to unblock follow-up PRs.
On Wed, May 20, 2020 at 11:55 PM Wes McKinney wrote:
I just opened the PR https://github.com/apache/arrow/pull/7240
I'm sorry it's so big. I really think this is the best way. The only
further work I plan to do on it is to get the CI passing.
On Wed, May 20, 2020 at 12:26 PM Wes McKinney wrote:
I'd guess I'm < 24 hours away from putting up my initial PR for this
work. Since the work is large and (for all practical purposes) nearly
impossible to separate into independently merge-ready PRs, I'll start a
new e-mail thread describing what I've done in more detail and
proposing a path for merging.
I'm working actively on this but perhaps as expected it has ballooned
into a very large project -- it's unclear at the moment whether I'll
be able to break the work into smaller patches that are easier to
digest. I'm working as fast as I can to have an initial
feature-preserving PR up, but the chan
On Wed, Apr 22, 2020 at 12:41 AM Micah Kornfield wrote:
Hi Wes,
I haven't had time to read the doc, but wanted to ask some questions on
points raised on the thread.
> * For efficiency, kernels used for array-expr evaluation should write
> into preallocated memory as their default mode. This enables the
> interpreter to avoid temporary memory allocations
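To make the preallocated-output idea concrete, here is a minimal C++ sketch
(my own illustration with a hypothetical AddInt64 kernel, not the actual
Arrow kernel API): the caller sizes and owns the output buffer, and the
kernel only writes into it, so an expression interpreter can reuse one
scratch buffer across operators instead of allocating a temporary per node.

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Hypothetical kernel signature: inputs are read-only, the output buffer
    // is caller-provided and already sized to `length`, and the kernel itself
    // performs no allocation.
    void AddInt64(const std::int64_t* left, const std::int64_t* right,
                  std::int64_t* out, std::size_t length) {
      for (std::size_t i = 0; i < length; ++i) {
        out[i] = left[i] + right[i];  // write into preallocated memory
      }
    }

    int main() {
      std::vector<std::int64_t> a{1, 2, 3}, b{10, 20, 30};
      std::vector<std::int64_t> out(a.size());  // allocated once by the caller
      AddInt64(a.data(), b.data(), out.data(), out.size());
      // `out` can now serve as the destination for the next expression node,
      // e.g. when evaluating (a + b) * c, avoiding a temporary per operator.
      return 0;
    }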
On Tue, Apr 21, 2020 at 7:32 AM Antoine Pitrou wrote:
On 21/04/2020 at 13:53, Wes McKinney wrote:
>>
>> That said, in the SortToIndices case, this wouldn't be a problem, since
>> only the second pass writes to the output.
>
> This kernel is not valid for normal array-exprs (see the spreadsheet I
> linked), such as what you can write in SQL
>
> K
hi Antoine,
On Tue, Apr 21, 2020 at 4:54 AM Antoine Pitrou wrote:
>
>
> On 21/04/2020 at 11:13, Antoine Pitrou wrote:
> >
> It would be interesting to know how costly repeated
> allocation/deallocation is. Modern allocators like jemalloc do their
> own caching instead of always returning memory to the system
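As a rough way to get a feel for that cost, here is an illustrative C++
micro-benchmark sketch (my own, not from the thread; absolute numbers depend
heavily on the allocator in use, e.g. jemalloc versus the system malloc, and
on buffer sizes) comparing a fresh allocation per batch with one reused
preallocated buffer:

    #include <chrono>
    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    int main() {
      constexpr std::size_t kLength = 1 << 20;  // ~8 MB of int64 per "batch"
      constexpr int kIters = 500;
      using Clock = std::chrono::steady_clock;
      long long sink = 0;  // keep results observable so loops are not elided

      // Variant A: allocate (and zero-initialize) a fresh buffer per batch.
      // Note this also measures std::vector's zeroing, not just the allocator.
      auto t0 = Clock::now();
      for (int i = 0; i < kIters; ++i) {
        std::vector<std::int64_t> tmp(kLength);
        tmp[0] = i;
        sink += tmp[kLength - 1];
      }
      auto t1 = Clock::now();

      // Variant B: allocate one buffer up front and reuse it for every batch.
      std::vector<std::int64_t> reused(kLength);
      for (int i = 0; i < kIters; ++i) {
        reused[0] = i;
        sink += reused[kLength - 1];
      }
      auto t2 = Clock::now();

      auto ms = [](Clock::duration d) {
        return std::chrono::duration_cast<std::chrono::milliseconds>(d).count();
      };
      std::printf("fresh alloc: %lld ms, reused buffer: %lld ms (sink=%lld)\n",
                  static_cast<long long>(ms(t1 - t0)),
                  static_cast<long long>(ms(t2 - t1)), sink);
      return 0;
    }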
Hi Sven,
On Mon, Apr 20, 2020 at 11:49 PM Sven Wagner-Boysen wrote:
On 21/04/2020 at 11:13, Antoine Pitrou wrote:
>
> This assumes that all these kernels can safely write into one of their
> inputs. This should be true for trivial ones, but not if e.g. a kernel
> makes two passes over its input. For example, the SortToIndices kernel
> first scans the input f
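The hazard described here can be shown with a toy C++ sketch (an
illustration of the aliasing problem only, not Arrow code): any kernel that
still needs to read its input after it has begun writing output produces
wrong results if the output buffer aliases the input.

    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Toy kernel: out[i] = number of input values smaller than in[i].
    // Every output element requires re-reading the whole input, so the input
    // must stay intact until the kernel finishes.
    void RankSmaller(const std::int32_t* in, std::int32_t* out, std::size_t n) {
      for (std::size_t i = 0; i < n; ++i) {
        std::int32_t count = 0;
        for (std::size_t j = 0; j < n; ++j) {
          if (in[j] < in[i]) ++count;
        }
        out[i] = count;  // if out aliases in, later reads see ranks, not values
      }
    }

    int main() {
      std::vector<std::int32_t> data{30, 10, 20};
      std::vector<std::int32_t> separate(data.size());
      RankSmaller(data.data(), separate.data(), data.size());  // 2 0 1, correct
      RankSmaller(data.data(), data.data(), data.size());      // 2 1 2, wrong
      std::printf("separate: %d %d %d  in-place: %d %d %d\n",
                  separate[0], separate[1], separate[2],
                  data[0], data[1], data[2]);
      return 0;
    }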
Hi Wes,
On 18/04/2020 at 23:41, Wes McKinney wrote:
>
> There are some problems with our current collection of kernels in the
> context of array-expr evaluation in query processing:
>
> * For efficiency, kernels used for array-expr evaluation should write
> into preallocated memory as their default mode. This enables the
> interpreter to avoid temporary memory allocations
Hi Wes,
I think reducing temporary memory allocation is a great effort and will
show great benefit in compute intensive scenarios.
As we are mainly working with the Rust and Datafusion part of the Arrow
project I was wondering how we could best align the concepts and
implementations on that level.
I started a brain dump of some issues that come to mind around kernel
implementation and array expression evaluation. I'll try to fill this
out, and it would be helpful to add supporting citations to other
projects about what kinds of issues come up and what implementation
strategies may be helpful
hi folks,
This e-mail comes in the context of two C++ data processing
subprojects we have discussed in the past:
* Data Frame API
https://docs.google.com/document/d/1XHe_j87n2VHGzEbnLe786GHbbcbrzbjgG8D0IXWAeHg/edit
* In-memory Query Engine
https://docs.google.com/document/d/10RoUZmiMQRi_J1FcPeVAUA