Thanks for the quick response, Wenchen! I'll leave this thread for early tomorrow so that someone in US timezone can chime in, and craft a patch if no one objects.
On Wed, Dec 11, 2019 at 4:41 PM Wenchen Fan <cloud0...@gmail.com> wrote: > PartitionReader extends Closable, seems reasonable to me to do the same > for DataWriter. > > On Wed, Dec 11, 2019 at 1:35 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> > wrote: > >> Hi devs, >> >> I'd like to propose to add close() on DataWriter explicitly, which is the >> place for resource cleanup. >> >> The rationalization of the proposal is due to the lifecycle of >> DataWriter. If the scaladoc of DataWriter is correct, the lifecycle of >> DataWriter instance ends at either commit() or abort(). That makes >> datasource implementors to feel they can place resource cleanup in both >> sides, but abort() can be called when commit() fails; so they have to >> ensure they don't do double-cleanup if cleanup is not idempotent. >> >> I've checked some callers to see whether they can apply >> "try-catch-finally" to ensure close() is called at the end of lifecycle for >> DataWriter, and they look like so, but I might be missing something. >> >> What do you think? It would bring backward incompatible change, but given >> the interface is marked as Evolving and we're making backward incompatible >> changes in Spark 3.0, so I feel it may not matter. >> >> Would love to hear your thoughts. >> >> Thanks in advance, >> Jungtaek Lim (HeartSaVioR) >> >> >>