Apologies for resurrecting this thread after so long. I’ve looked over the thread again today and it seems there is general consensus on the desired semantics. I will start a vote thread.
B.

> On 24 Jul 2020, at 18:27, Nick Vatamaniuc <vatam...@gmail.com> wrote:
>
> Great discussion everyone!
>
> For normal replications, I think it might be nice to make an exception
> and allow server-side pagination for compatibility at first, with a
> new option to explicitly enable strict snapshots behavior. Then, in a
> later release make it the default to match _all_docs and _view reads.
> In other words, for a short while, we'd support bi-directional
> replications between 4.x and 1/2/3.x on any replicator and document
> that fact, then after a while will switch that capability off and
> users would have to run replications on a 4.x replicator only, or
> specially updated 3.x replicators.
>
>> I'd rather support this scenario than have to support explaining why the
>> "one shot" replication back to an old 1.x, when initiated by a 1.x cluster,
>> is returning results "ahead" of the time at which the one-shot replication
>> was started.
>
> Ah, that won't happen in the current fdb prototype branch
> implementation. What might happen is there would be changes present in
> the changes feed that happened _after_ the request has started. That
> won't be any different than if a node where replication runs restarts,
> or there is a network glitch. The changes feed would proceed from the
> last checkpoint and see changes that happened after the initial
> starting sequence and apply them in order (document "a" was deleted,
> then it was updated again then deleted again, every change will be
> applied incrementally to the target, etc).
>
> We'd have to document the fact that a single snapshot replication from
> 4.x -> 1/2/3.x is impossible anyway (unless we do the trick where we
> compare the update sequence and db was not updated in the meantime or
> the new FDB storage engine allows it). The question then becomes if
> we allow the pagination to happen on the client or the server.
> In case of normal replication I think it would be nice to allow it to happen
> on the server for a bit to allow for maximum initial replication
> interoperability.
>
>> For cases where you’re not concerned about the snapshot isolation (e.g.
>> streaming an entire _changes feed), there is a small performance benefit to
>> requesting a new FDB transaction asynchronously before the old one actually
>> times out and swapping over to it. That’s a pattern I’ve seen in other FDB
>> layers but I’m not sure we’ve used it anywhere in CouchDB yet.
>
> Good point, Adam. We could optimize that part, yeah. Fetch a GRV after
> 4.9 seconds or so and keep it ready to go for example. So far we tried
> to react to the transaction_too_old exception, as opposed to starting
> a timer there in order to allow us to use the maximum time a tx is
> alive, to save a few seconds or milliseconds. That required some
> tricks such as handling the exception bubbling up from either the
> range read itself, or from the user's callback (say if user code in
> the callback fetched a doc body which blew up with a
> transaction_too_old exception). As an interesting aside, from quick
> experiments I had noticed we were able to stream about 100-150k rows
> from a single tx snapshot, that wasn't too bad I thought.
>
> Speaking of replication, I am trying to see what the replicator might
> look like in 4.x in the https://github.com/apache/couchdb/pull/3015
> (prototype/fdb-replicator branch). It's very much a WIP and a hot mess
> currently. Will issue an RFC once I have a better handle on the
> general shape of it. So far it's based on couch_jobs, with a global
> queue, and looks like it might be smaller overall, as it's leveraging
> the scheduling capabilities already present in couch_jobs, but
> once started, the individual replication job process hierarchy is largely
> the same as before.
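
For anyone skimming the thread later, here is a rough sketch of the "react to transaction_too_old and carry on" approach described above. It is written in Python rather than Erlang and does not use the real FoundationDB bindings or any CouchDB code: FakeTx, TransactionTooOld and stream_rows are hypothetical names, and a per-transaction row budget stands in for the real time limit so the example stays deterministic. Note that each restart reads from a newer snapshot in a real system, so the rows streamed this way are not a single consistent snapshot.

```python
class TransactionTooOld(Exception):
    """Stand-in for FoundationDB's transaction_too_old error (hypothetical)."""


class FakeTx:
    """Hypothetical transaction that can serve at most `budget` rows.

    A real transaction expires on time, not on row count; a row budget
    simply keeps this sketch deterministic.
    """

    def __init__(self, rows, budget):
        self._rows = rows      # sorted list of (key, value) pairs
        self._budget = budget

    def get_range(self, after_key):
        """Yield rows with key > after_key until the budget runs out."""
        for key, value in self._rows:
            if after_key is not None and key <= after_key:
                continue
            if self._budget == 0:
                raise TransactionTooOld()
            self._budget -= 1
            yield key, value


def stream_rows(rows, budget, callback):
    """Stream every row to `callback`, restarting in a fresh transaction
    whenever the current one expires. Returns the number of transactions used.
    """
    if budget < 1:
        raise ValueError("budget must be at least 1")
    last_key = None
    tx_count = 0
    while True:
        tx = FakeTx(rows, budget)
        tx_count += 1
        try:
            for key, value in tx.get_range(last_key):
                callback(key, value)
                # Advance only after the callback succeeded, so a row whose
                # callback failed is retried in the next transaction.
                last_key = key
            return tx_count
        except TransactionTooOld:
            # Raised either by the range read or by the user's callback;
            # both cases resume after the last fully processed key.
            continue
```

The one design point worth calling out is that last_key only moves forward once the callback has returned, which is what lets the same handler cover the exception coming from the range read and from user code in the callback.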
>
> Cheers,
> -Nick
>
>
> On Wed, Jul 22, 2020 at 8:48 AM Bessenyei Balázs Donát
> <bes...@apache.org> wrote:
>>
>> On Tue, 21 Jul 2020 at 18:45, Jan Lehnardt <j...@apache.org> wrote:
>>> I’m not sure why a URL parameter vs. a path makes a big difference?
>>>
>>> Do you have an example?
>>>
>>> Best
>>> Jan
>>> —
>>
>> Oh, sure! OpenAPI Generator [1] et al., for example, generate Java
>> methods (like [2] out of spec [3]) per path per verb.
>> Java's type safety and the way methods are currently generated don't
>> really provide an easy way to retrieve multiple kinds of responses, so
>> having them separate would help a lot there.
>>
>>
>> Donat
>>
>> PS. I'm getting self-conscious about discussing this in this thread.
>> Should I open a new one?
>>
>>
>> [1] https://openapi-generator.tech/
>> [2] https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/src/main/java/org/openapitools/client/api/PetApi.java#L606
>> [3] https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/api/openapi.yaml#L208