Hi Wes,

I haven't had time to read the doc, but wanted to ask some questions about
points raised in the thread.

> * For efficiency, kernels used for array-expr evaluation should write
> into preallocated memory as their default mode. This enables the
> interpreter to avoid temporary memory allocations and improve CPU
> cache utilization. Almost none of our kernels are implemented this way
> currently.

Did something change? I was pretty sure I submitted a patch a while ago for
boolean kernels that separated memory allocation from computation, which
should allow writing into the same memory. Is this a concern with the
public Function APIs for the Kernel APIs themselves, or a lower-level
implementation concern?

> * Sorting is generally handled by different data processing nodes from
> Projections, Aggregations / Hash Aggregations, Filters, and Joins.
> Projections and Filters use expressions, they do not sort.

Would sorting the list-column elements per row be an array-expr?

On Tue, Apr 21, 2020 at 5:35 AM Wes McKinney <wesmck...@gmail.com> wrote:

> On Tue, Apr 21, 2020 at 7:32 AM Antoine Pitrou <anto...@python.org> wrote:
> >
> > On 21/04/2020 at 13:53, Wes McKinney wrote:
> > >>
> > >> That said, in the SortToIndices case, this wouldn't be a problem, since
> > >> only the second pass writes to the output.
> > >
> > > This kernel is not valid for normal array-exprs (see the spreadsheet I
> > > linked), such as what you can write in SQL
> > >
> > > Kernels like SortToIndices are a different type of function (in other
> > > words, "not a SQL function") and so if we choose to allow such a
> > > "non-SQL-like" functions in the expression evaluator then different
> > > logic must be used.
> >
> > Hmm, I think that maybe I'm misunderstanding at which level we're
> > talking here. SortToIndices() may not be a "SQL function", but it looks
> > like an important basic block for a query engine (since, after all,
> > sorting results is an often used feature in SQL and other languages).
> > So it should be usable *inside* the expression engine, even though it's
> > not part of the exposed vocabulary, no?
>
> No, not as part of "expressions" as they are defined in the context of
> SQL engines.
>
> Sorting is generally handled by different data processing nodes from
> Projections, Aggregations / Hash Aggregations, Filters, and Joins.
> Projections and Filters use expressions, they do not sort.
>
> > Regards
> >
> > Antoine.
>
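
For reference, the split between allocation and computation raised in the first question can be sketched roughly as follows. This is an illustration in plain Python, not Arrow code: `and_into` and `and_alloc` are hypothetical names, and byte-per-value buffers stand in for Arrow's bit-packed booleans.

```python
# Hypothetical sketch; none of these names are Arrow APIs.

def and_into(left, right, out):
    """Element-wise AND of two boolean buffers, written into `out`.

    The kernel allocates nothing: the caller owns and supplies `out`.
    """
    if not (len(left) == len(right) == len(out)):
        raise ValueError("all buffers must have the same length")
    for i in range(len(out)):
        out[i] = left[i] & right[i]


def and_alloc(left, right):
    """Convenience wrapper: allocate the output, then run the kernel."""
    out = bytearray(len(left))
    and_into(left, right, out)
    return out


a = bytearray([1, 1, 0, 1])
b = bytearray([1, 0, 0, 1])
c = bytearray([0, 1, 1, 1])

# Evaluate (a AND b) AND c with a single preallocated buffer.
# Overwriting the left input in the second call is safe here because
# each output element depends only on the inputs at the same index.
scratch = bytearray(len(a))
and_into(a, b, scratch)
and_into(scratch, c, scratch)
result = list(scratch)
```

With this shape, an expression evaluator could reuse `scratch` across steps instead of allocating a temporary per kernel call. The aliasing trick relies on the kernel being element-wise; whether a kernel such as SortToIndices fits the same contract is the point under discussion in the quoted thread.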