I've opened https://issues.apache.org/jira/browse/ARROW-11945 to track this improvement.
On Tue, Mar 16, 2021 at 3:40 PM Wes McKinney <wesmck...@gmail.com> wrote: > It seems fine to me to return Result from KernelInit (as long as a > context of some kind of still passed in to the function to have the > possibility of passing other configuration bits), and it would make > development more convenient to use a common error handling strategy in > more places > > On Fri, Mar 12, 2021 at 8:52 AM Antoine Pitrou <anto...@python.org> wrote: > > > > > > I wouldn't mind changing those APIs to return a Status. > > I'll also note that KernelContext::SetStatus() isn't thread-safe. > > > > Regards > > > > Antoine. > > > > > > Le 12/03/2021 à 11:40, Benjamin Kietzman a écrit : > > > My primary point is that using KernelContext to hold error statuses is > > > confusing > > > since there are more places to check for an error condition. In the > rest of > > > the > > > c++ library we use RETURN_NOT_OK or ARROW_ASSIGN_OR_RAISE to > > > handle stack unwinding from an error, but in the presence of > KernelContext > > > it's > > > also necessary to check KernelContext::HasError. > > > > > > The specific case of a Kernel::init which must allocate actually > provides > > > a good example of this: KernelContext includes helper methods > AllocateBuffer > > > and AllocateBitmap which use the context's memory pool. These return > > > Result<>, > > > whose status must then be checked and errors propagated using > > > KernelContext::SetStatus (and not ARROW_ASSIGN_OR_RAISE as > > > in the rest of the c++ library) since Kernel::init doesn't support > status > > > returns. > > > > > > IMHO it'd increase readability of kernel code to handle errors > uniformly > > > wherever possible. > > > > > > On Fri, Mar 12, 2021, 01:00 Yibo Cai <yibo....@arm.com> wrote: > > > > > >> Beside reporting errors, maybe a kernel wants to allocate memory > through > > >> KernelContext::memory_pool [1] in Kernel::init? > > >> I'm not quite sure if this is a valid case. Would like to hear other > > >> comments. > > >> > > >> [1] > > >> > https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernel.h#L95 > > >> > > >> Yibo > > >> > > >> On 3/12/21 5:24 AM, Benjamin Kietzman wrote: > > >>> KernelContext is a tuple consisting of a pointers to an ExecContext > and > > >>> KernelState > > >>> and an error Status. The context's error Status may be set by compute > > >>> kernels (for > > >>> example when divide-by-zero would occur) rather than returning a > Result > > >> as > > >>> in the > > >>> rest of the codebase. IIUC the intent is to avoid branching on > always-ok > > >>> Statuses > > >>> for kernels which don't have an error condition (for example addition > > >>> without overflow > > >>> checks). > > >>> > > >>> If there's a motivating performance reason for non standard error > > >>> propagation then > > >>> we should continue using KernelContext wherever we can benefit from > it. > > >>> However, > > >>> several other APIs (such as Kernel::init) also use a KernelContext to > > >>> report errors. > > >>> IMHO, it would be better to avoid the added cognitive overhead of > > >> handling > > >>> errors > > >>> through KernelContext outside hot loops which benefit from it. > > >>> > > >>> Am I missing anything? Is there any reason (for example) Kernel::init > > >>> shouldn't just > > >>> return a Result<unique_ptr<KernelState>>? > > >>> > > >>> Ben Kietzman > > >>> > > >> > > > >