Hello. I am trying to write a storage function in Pig and I'd like to know what guarantees there are on the StoreFunc's prepareToWrite, cleanupOnFailure and cleanupOnSuccess methods.
In particular, when are these functions called? Is it once per task or once per tuple? The store that I am writing to expects a flow like: open connection, many, many writes, close connection. If it turns out that prepareToWrite and cleanupOnSuccess get called for every tuple, that would be very problematic on large datasets. But once per task (or so) would be reasonable.

Pointers to the Pig code controlling the invocation of these functions would be especially appreciated.

Cheers,
Nate Segerlind
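
To make the flow I'm after concrete, here is a minimal sketch in plain Java. It does not use any Pig classes: ConnectionFlowSketch, RecordingConnection and runTask are hypothetical names made up for illustration, and the comments about where the StoreFunc methods might fit are my hope, not something I have confirmed in the Pig source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from Pig.
class ConnectionFlowSketch {

    /** Hypothetical stand-in for the external store; it records each call. */
    static class RecordingConnection {
        final List<String> events = new ArrayList<>();
        private boolean open = false;

        void open() {
            open = true;
            events.add("open");
        }

        void write(String record) {
            if (!open) {
                throw new IllegalStateException("write before open");
            }
            events.add("write:" + record);
        }

        void close() {
            open = false;
            events.add("close");
        }
    }

    /** Runs the desired flow: one open, one write per record, one close. */
    static List<String> runTask(List<String> records) {
        RecordingConnection conn = new RecordingConnection();
        conn.open();                // assumption: where I hope prepareToWrite fits
        try {
            for (String record : records) {
                conn.write(record); // once per tuple
            }
        } finally {
            conn.close();           // assumption: once per task, not per tuple
        }
        return conn.events;
    }

    public static void main(String[] args) {
        System.out.println(runTask(Arrays.asList("a", "b", "c")));
    }
}
```

If the open and close steps ran once per tuple instead, each record would pay the full connection setup cost, which is what I want to avoid.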
