On Wed, Jul 10, 2013 at 7:50 AM, Johan Corveleyn <jcor...@gmail.com> wrote: > On Wed, Jul 10, 2013 at 1:24 PM, Philip Martin > <philip.mar...@wandisco.com> wrote: >> Johan Corveleyn <jcor...@gmail.com> writes: >> >>> Can someone explain again why ra_serf / serf can't just resend >>> requests, with C-L, whenever it has received a 411? So (after we know >>> it's HTTP/1.1) send every request optimistically assuming chunked >>> encoding, and fallback whenever we get a 411 (and perhaps remember the >>> fallback, so we downgrade that entire connection / session). >>> >>> Why is that not possible? Is it because (a) it's too hard to implement >>> (lots of code paths all over the place), (b) we can't resend requests >>> because we don't have them available in the right place anymore after >>> sending, (c) it would potentially break some ordering things (really? >>> why?), (d) it would require serf 1.4 anyway, or (e) something else? >> >> Partly a), mostly c). It is relatively easy to do when ra_serf is >> processing requests syncronously. It gets much more difficult when >> ra_serf starts using pipelined requests as there may be outstanding >> requests when the 411 is received. I posted a patch that keeps track of >> outstanding requests and retries, it works in lots of cases but won't >> work in all cases. > > Is pipelining optional within (ra-)serf's logic? I.e. can the "start > pipelining" switch simply be flipped a little bit later, once we're > sure that chunking works?
Yes. And funny you mention it because I was saying the same thing to Ben on IRC last night. However: we cannot do that in a simple fashion, such that I would be comfortable suggesting a backport to 1.8.x. Consider if we adopted the "auto" approach and generally sent an extra OPTIONS request. In the 1.9.x series, we could start to optimize that away by using the next synchronous request instead. Or when we hit your question (b), if a body is hard to reproduce, then we hold that for a moment and send the OPTIONS. And we could disable pipelining for one request, while we probe the connection's level of support. > That way we won't do an extra (synchronous, > but useless) request just for detecting chunkness, but we just > continue working synchronously with our normal (useful) requests a > tiny bit longer, before switching to pipelined mode. Still a small > performance hit in general, but it might be negligible. Right. And something very doable for 1.9.x. I also want to implement request compression. That will help some during an import (where we self-diff the imported file, rather than sending a diff-against-BASE). An analysis by ghudson long ago showed that our self-diff is actually pretty good compared to gzip, but I'd prefer to use optimized gzip at both ends instead of our algorithm. > Totally making abstraction of implementation complexity, of course ... > > Of course, if you could make your patch work in all cases, Philip, > that would be even better :-). The ra_serf stuff isn't too bad. Much of the request setup/handling is in util.c. But we have about three different run loops right now (one-shot/synchronous, update, and (iirc) replay). They optimize the connection(s) in different ways. When we switch to Ev2, there will be some great optimizations in commit. In particular, alter_file(CONTENTS) means that our callback-from-serf can read the stream. No transient spillbuf/file necessary (compared to Ev1, where apply_textdelta pushes content *into* the RA commit process). Within 1.9.x, we also can switch more temp-file usage over to use the spillbuf bucket (ref: sb_bucket.c). That'll keep small items in memory rather than forced-to-disk. But back to the original point: what is the smallest patch that we're comfortable with now? From a user standpoint, and a backport-safety standpoint. The "auto" suggestion is likely our best solution for now. Ben put together a branch, but I think it's more complicated than necessary. That's fixable though :-) Cheers, -g