Judging by the comments around the code, I had hoped so too, but the issue both looks and acts like OTP-9167, and I'm not sure how to classify it as anything other than OTP-9167. However, I also have a feeling that some code is missing around this note: https://github.com/apache/couchdb/blob/master/src/couch_replicator/src/couch_replicator.erl#L299
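To make the failure mode from my earlier message (quoted below) easier to discuss, here is a toy model in Python. It is only a sketch of the behaviour as I described it, not of the real OTP supervisor or of couch_replicator: `ToySupervisor`, `start_replication` and `rep_id_with_suffix` are hypothetical names invented for this illustration. It shows a supervisor that keeps the first child spec for an id and reuses it on restart, and how making the id depend on `use_checkpoints` (idea 3) sidesteps that.

```python
class ToySupervisor:
    """Toy model (hypothetical): keeps the spec of a stopped child, keyed by id."""

    def __init__(self):
        self.specs = {}    # child id -> spec retained by the supervisor
        self.running = {}  # child id -> spec the live child was started with

    def start_child(self, child_id, spec):
        if child_id in self.running:
            return ("error", "already_started")
        if child_id in self.specs:
            # A spec with this id is already registered; the new one is ignored.
            return ("error", "already_present")
        self.specs[child_id] = spec
        self.running[child_id] = spec
        return ("ok", spec)

    def terminate_child(self, child_id):
        # The child stops, but its spec stays registered.
        self.running.pop(child_id, None)

    def restart_child(self, child_id):
        # Restart always uses the retained (initial) spec.
        spec = self.specs[child_id]
        self.running[child_id] = spec
        return ("ok", spec)


def start_replication(sup, rep_id, options):
    """Start a job; fall back to restart if the id is already registered."""
    result = sup.start_child(rep_id, options)
    if result == ("error", "already_present"):
        return sup.restart_child(rep_id)
    return result


def rep_id_with_suffix(base_id, options):
    """Idea 3 (assumed shape): derive a distinct id when checkpoints are off."""
    if options.get("use_checkpoints", True):
        return base_id
    return base_id + "+no_checkpoints"


first = {"use_checkpoints": True}
second = {"use_checkpoints": False}

# Same id both times: the second start silently reuses the first spec.
sup = ToySupervisor()
start_replication(sup, "abc", first)
sup.terminate_child("abc")
_, stale = start_replication(sup, "abc", second)

# Suffixed id: the second start registers a fresh spec.
sup2 = ToySupervisor()
start_replication(sup2, rep_id_with_suffix("abc", first), first)
sup2.terminate_child("abc")
_, fresh = start_replication(sup2, rep_id_with_suffix("abc", second), second)

print(stale["use_checkpoints"], fresh["use_checkpoints"])
```

In this model the first scenario ends up running with the old options, which is what the failing test observes, while the suffixed id gets the new options at the cost of treating the two runs as different replications.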
--
,,,^..^,,,

On Sun, May 4, 2014 at 7:43 PM, Robert Samuel Newson <[email protected]> wrote:
> Hrm, OTP-9167 was reported by Filipe, the main author of the current couchdb
> replicator, and he also changed how this was handled in couchdb to
> compensate. This is some ancient stuff, hard to believe it’s the cause of our
> latest issue. I must be missing something.
>
> On 4 May 2014, at 12:46, Alexander Shorin <[email protected]> wrote:
>
>> On Wed, Apr 30, 2014 at 4:17 PM, Alexander Shorin <[email protected]> wrote:
>>> On Tue, Apr 29, 2014 at 5:56 PM, Alexander Shorin <[email protected]> wrote:
>>>> On Wed, Apr 23, 2014 at 1:03 PM, Mutton, James <[email protected]> wrote:
>>>>> well, bummer. Tried 3 times on R14B01, all 3 I get:
>>>>> /tmp/couchdb/dist/apache-couchdb-1.6.0/apache-couchdb-1.6.0/_build/../src/couch_replicator/test/07-use-checkpoints.t
>>>>> .......... Failed 4/16 subtests
>>>>>
>>>>> Test Summary Report
>>>>> -------------------
>>>>> /tmp/couchdb/dist/apache-couchdb-1.6.0/apache-couchdb-1.6.0/_build/../src/couch_replicator/test/07-use-checkpoints.t
>>>>> (Wstat: 0 Tests: 16 Failed: 4)
>>>>> Failed tests: 9, 12-13, 15
>>>>> Files=7, Tests=1832, 150 wallclock secs ( 0.81 usr 0.09 sys + 155.32
>>>>> cusr 13.16 csys = 169.38 CPU)
>>>>> Result: FAIL
>>>>> make[3]: *** [check] Error 1
>>>>>
>>>>> Unfortunately, I’m needing some sleep then leaving on some vacation for
>>>>> the rest of the week. I’ll see if I can maybe look closer at what’s
>>>>> going on locally while on the flight.
>>>>
>>>> I'm failed to reproduce this with R14B04, but will try to R14B01 as you
>>>> have.
>>>
>>> Confirmed for R14B01.
>>
>> Ok, I've found the roots of this issue. It's even named as OTP-9167 as
>> was fixed in R14B03 and because of it 07-use-checkpoints.t fails for
>> R14B01: it couldn't run replicator worker with new child spec where
>> use_checkpoint bit flipped because supervisor hold the initial one, it
>> see that there replication with the same id going to happen and
>> restarts it with the old spec ignoring any changes. I could fix the
>> test, but I couldn't fix the issue in root and not sure that it's
>> worths to search for any workarounds nowdays (R14B03 was released at
>> 2011-05-24, almost 3 years ago).
>>
>> However, here are three solutions that I have:
>>
>> 0. Do nothing.
>> 1. Isolate tests from each other to hide the issue (isolation is good,
>> but hiding bugs is bad):
>> https://www.friendpaste.com/1lnTEFg6RId5PDRAmvbBVO
>> 2. On test failure check Erlang version and note that this failure is
>> *fine* for specific versions:
>> https://www.friendpaste.com/3TmqoNjEF3xnYtbLybSL7G
>> 3. Add "+no_checkpoints" suffix to replication id if "use_checkpoints:
>> false" was specified. Thus, it solves the problem:
>> https://www.friendpaste.com/3TmqoNjEF3xnYtbLybSKpT (yes, some
>> refactoring love is required)
>> But I'm not sure that this is good idea.
>>
>> Personally, I would prefer to keep this "bug" alive as reminder that
>> things for your Erlang version *could* happens wrong. Your thoughts?
>>
>>
>> --
>> ,,,^..^,,,
>
