I agree, as a C++ library it is acceptable to let these exceptions bubble up to the client, and for the C bindings, all exceptions should be caught and translated to appropriate error codes. Most other languages that interface with it will probably use the C wrapper and will gain visibility into these exceptional conditions and handle it how they see fit.
On Tue, Jun 7, 2016 at 8:04 AM Todd Lipcon <t...@cloudera.com> wrote: > Hi Micah, > > You're right - in Kudu at least we don't make attempts to recover from > out-of-memory errors like std::bad_alloc. So, if 'new' or any of the STL > libraries throw such an error, we'll just let it propagate up the stack and > crash the server. In my experience, trying to recover from such errors is > fraught with difficulty -- often you'd want to allocate in the recovery > path, and short of implementing "emergency allocators" and such, things > just get too hard to reason about and test. > > In the context of an embeddable *C* library I do think there's some value > in eventually catching std::bad_alloc and returning error codes like > ENOMEM. However, I'd personally vote that to be given very low priority in > the general queue of features to be implemented, since I imagine that most > of the big data applications that would embed Arrow are likely to follow > the same "crash on alloc failure" anyway. > > Another thing to note is that in any cases where Arrow does allocate large > amounts of memory (eg when reading an on-disk file or user data) I do think > some form of memory limiting is advisable. For example, even though Kudu > crashes on OOM, we closely track our memory usage and start to reject user > write requests after crossing certain thresholds. Providing enough hooks > for embedders of Arrow to do similar is a good idea. > > -Todd > > On Tue, May 31, 2016 at 7:33 AM, Micah Kornfield <emkornfi...@gmail.com> > wrote: > > > We are following the google style guide which prohibits exceptions [1]. > > Several places in our code we use std::vector and std::make_shared > > which can potentially throw std::bad_alloc exceptions that we don't > > currently wrap in try/catch blocks. These classes/objects are used > > for constructing/manipulating metadata (all errors for allocations of > > data are checked and returned via Status objects). > > > > In general, it seems unlikely that these code paths would get > > triggered, but there is still a possibility. > > > > Does anybody have any thoughts on how thorough we want to be about > > avoiding these types of edge cases? > > > > My thought is to leave the code as is and accept the small chance that > > these get hit. The main rationale is that if these types of calls do > > raise std::bad_alloc, it is highly likely that the program using the > > Arrow library is likely in an unstable state already (most of the > > allocations should be less then 100 bytes or so). A second point is > > that a cursory search of Kudu [2] and Impala [3] code bases (both of > > which also intend to follow the style guide) turn up some similar > > calls that could throw std::bad_alloc (it is possible there is some > > other magic happening here to avoid these). > > > > This isn't completely satisfying though, so I'm happy to be convinced > > otherwise. > > > > Thanks, > > -Micah > > > > > > [1] https://google.github.io/styleguide/cppguide.html#Exceptions > > [2] > > > https://github.com/cloudera/kudu/blob/7f3691a826b9d27199319409f8d721ec1d08a8ba/src/kudu/consensus/log_reader.cc#L74 > > [3] > > > https://github.com/cloudera/Impala/blob/a36dcfc0322e213c06d6cf8d3f330c4b06739523/be/src/common/object-pool.h > > > > > > -- > Todd Lipcon > Software Engineer, Cloudera > -- -- Cheers, Leif