Thanks Bill, A couple of questions:
1. why do we need both restore and restoreAll, why can't we just have one, that takes a collection (i.e., restore all)? Are there cases when people want to restore one at a time? In that case, they could still use restoreAll with just 1 record in the collection right? 2. I don't quite get "onBatchRestored". Could you put a small comment on top of all three methods. An example might help here. Thanks Eno > On 8 Jun 2017, at 18:05, Bill Bejeck <bbej...@gmail.com> wrote: > > Guozhang, Damian thanks for the comments. > > Giving developers the ability to hook into StateStore recovery phases was > part of my original intent. However the state the KIP is in now won't > provide this functionality. > > As a result I'll be doing a significant revision of KIP-167. I'll be sure > to incorporate all your comments in the new revision. > > Thanks, > Bill > > On Wed, Jun 7, 2017 at 5:24 AM, Damian Guy <damian....@gmail.com> wrote: > >> I'm largely in agreement with what Guozhang has suggested, i.e., >> StateRestoreContext shouldn't have any setters on it and also need to have >> the end offset available such that people can use it derive progress. >> Slightly different, maybe the StateRestoreContext interface could be: >> >> long beginOffset() >> long endOffset() >> long currentOffset() >> >> One further thing, this currently doesn't provide developers the ability to >> hook into this information if they are using the built-in StateStores. Is >> this something we should be considering? >> >> >> On Tue, 6 Jun 2017 at 23:32 Guozhang Wang <wangg...@gmail.com> wrote: >> >>> Thanks for the updated KIP Bill, I have a couple of comments: >>> >>> 1) I'm assuming beginRestore / endRestore is called only once per store >>> throughout the whole restoration process, and restoreAll is called per >>> batch. In that case I feel we can set the StateRestoreContext as a second >>> parameter in restoreAll and in endRestore as well, and let the library to >>> set the corresponding values instead and only let users to read (since >> the >>> collection of key-value pairs do not contain offset information anyways >>> users cannot really set the offset). The "lastOffsetRestored" would be >> the >>> starting offset when called on `beginRestore`. >>> >>> 2) Users who wants to implement their own batch restoration callbacks >> would >>> now need to implement both `restore` and `restoreAll` while they either >> let >>> `restoreAll` to call `restore` or implement the logic in `restoreAll` >> only >>> and never call `restore`. Maybe we can provide two abstract impl of >>> BatchingStateRestoreCallbacks which does beginRestore / endRestore as >>> no-ops, with one callback implementing `restoreAll` to call abstract >>> `restore` while the other implement `restore` to throw "not supported >>> exception" and keep `restoreAll` abstract. >>> >>> 3) I think we can also return the "offset limit" in StateRestoreContext, >>> which is important for users to track the restoration progress since >>> otherwise they could not tell how many percent of restoration has >>> completed. I.e.: >>> >>> public interface BatchingStateRestoreCallback extends >> StateRestoreCallback >>> { >>> >>> void restoreAll(Collection<KeyValue<byte[], byte []>> records, >>> StateRestoreContext >>> restoreContext); >>> >>> void beginRestore(StateRestoreContext restoreContext); >>> >>> void endRestore(StateRestoreContext restoreContext); >>> } >>> >>> public interface StateRestoreContext { >>> >>> long lastOffsetRestored(); >>> >>> long endOffsetToRestore(); >>> >>> int numberRestored(); >>> } >>> >>> >>> Guozhang >>> >>> >>> >>> On Fri, Jun 2, 2017 at 9:16 AM, Bill Bejeck <bbej...@gmail.com> wrote: >>> >>>> Guozhang, Matthias, >>>> >>>> Thanks for the comments. I have updated the KIP, (JIRA title and >>>> description as well). >>>> >>>> I had thought about introducing a separate interface altogether, but >>>> extending the current one makes more sense. >>>> >>>> As for intermediate callbacks based on time or number of records, I >> think >>>> the latest update to the KIP addresses this point of querying for >>>> intermediate results, but it would be per batch restored. >>>> >>>> Thanks, >>>> Bill >>>> >>>> >>>> >>>> >>>> >>>> On Fri, Jun 2, 2017 at 8:36 AM, Jim Jagielski <j...@jagunet.com> wrote: >>>> >>>>> >>>>>> On Jun 2, 2017, at 12:54 AM, Matthias J. Sax < >> matth...@confluent.io> >>>>> wrote: >>>>>> >>>>>> With regard to backward compatibility, we should not change the >>> current >>>>>> interface, but add a new interface that extends the current one. >>>>>> >>>>> >>>>> ++1 >>>>> >>>>> >>>> >>> >>> >>> >>> -- >>> -- Guozhang >>> >>