I think it has to do with how the vnodes are partitioned across your physical nodes. You really need a minimum of three physical nodes (or virtual machines) to deploy to and/or to do any meaningful failure testing.
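Roughly what I mean, as a toy sketch (this is not Riak's real claim or hashing code, and the node names are made up): hand the ring's partitions out round-robin and look at the preference list for a few keys. With only two physical nodes, every n_val=3 preflist has to repeat a node, so one box going down takes out two of a key's three primary replicas at once.

  # Toy illustration only -- not Riak's actual claim algorithm.
  require 'digest/sha1'

  RING_SIZE = 64   # default ring_creation_size
  N_VAL     = 3    # default bucket n_val

  # Naive round-robin "claim": partition i belongs to nodes[i % nodes.size].
  def preflist(key, nodes)
    ring = (0...RING_SIZE).map { |i| nodes[i % nodes.size] }
    idx  = Digest::SHA1.hexdigest(key).to_i(16) % RING_SIZE
    (0...N_VAL).map { |off| ring[(idx + off) % RING_SIZE] }
  end

  %w[orders/1 orders/2 users/42].each do |key|
    puts "#{key}:"
    puts "  2 nodes -> #{preflist(key, %w[node_a node_b]).inspect}"
    puts "  3 nodes -> #{preflist(key, %w[node_a node_b node_c]).inspect}"
  end
  # Two nodes: every preflist comes out [a, b, a] or [b, a, b], so killing one
  # node wipes out two of the three primaries for a large share of your keys
  # until fallbacks/handoff kick in. Three nodes: each preflist hits three
  # distinct boxes.

The real claim algorithm is smarter than round-robin, but with two boxes it simply cannot avoid doubling up, which is why three is the practical minimum.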
-Alexander

On Mon, Feb 28, 2011 at 13:29, Luca Spiller <l...@stackednotion.com> wrote:
> Hi all,
>
> I've come across some issues while testing what happens when failures occur
> on our system, for example a machine failing. One of the (slightly scary)
> issues I have come across is that, for a short while after a Riak node goes
> down, data read from another node isn't always consistent. I have written a
> small test script to demonstrate this issue:
>
> https://gist.github.com/847749
>
> Halfway through I switch off a node; here are the results:
>
> Deleted 0
> Wrote 100 454551
> 1298916758: 100 454551
> 1298916759: 100 454551
> 1298916760: 100 454551
> 1298916761: 100 454551
> 1298916762: 100 454551
> 1298916762: 100 454551
> 1298916763: 100 454551
> 1298916764: 100 454551 (Shutdown around here)
> 1298916765: 100 454551
> 1298916766: 99 460532
> 1298916767: 91 412241
> 1298916768: 100 454551
> 1298916769: 100 454551
> 1298916770: 100 454551
> 1298916771: 100 454551
> 1298916772: 100 454551
> 1298916773: 100 454551
> 1298916774: 100 454551
> 1298916775: 100 454551
> 1298916776: 100 454551
> 1298916777: 100 454551
> ^C1298916777: 100 454551
> Deleted 100
>
> Slightly more scary is that it appears to sometimes read old (deleted) data:
>
> Deleted 0
> Wrote 100 495792
> 1298916784: 100 495792
> 1298916785: 100 495792
> 1298916786: 100 495792
> 1298916786: 100 495792 (Shutdown around here)
> 1298916787: 100 495792
> 1298916788: 100 487322
> 1298916789: 100 495792
> 1298916790: 100 495792
> 1298916791: 100 495792
> 1298916792: 100 495792
> 1298916793: 100 495792
> 1298916794: 100 495792
> 1298916795: 100 495792
> ^C1298916796: 100 495792
> 1298916797: 100 495792
> Deleted 100
>
> This is using the Ripple library (0.8.3) talking directly to the local node;
> however, I believe the same problem happens when using the Erlang PBC
> library. The problem seems to be exacerbated when there are larger amounts
> of data stored in Riak, and the eventual consistency takes longer to occur.
>
> I am quite puzzled as to why this is happening. I could kind of understand
> it if data went missing, but the eventual consistency is what puzzles me: I
> only have two nodes, so why does the data eventually sort itself out?
> Secondly, why does this still happen even with W and DW set to 3? (I
> originally had the script using the default values, but thought I would try
> this.)
>
> Both of the nodes are running Riak 0.14.0; here are the relevant configs:
>
> https://gist.github.com/847756
>
> Apologies if I am just doing something stupid, it has been a rather long
> day :)
>
> Regards,
> Luca Spiller
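For anyone reproducing this: the actual script is in the gist linked above, but the shape of that kind of check is roughly the following. This is an illustrative sketch only, not the gist's contents; the bucket name, key count, and values are made up, and it talks to Riak 0.14's HTTP interface directly so it doesn't depend on a particular client library version.

  # Illustrative sketch only -- not the gist linked above.
  require 'net/http'
  require 'uri'

  BASE = 'http://127.0.0.1:8098/riak/consistency_test'
  KEYS = (1..100).map { |i| "key#{i}" }

  # Write a known set of values with w=3, dw=3, remembering their sum.
  total = 0
  KEYS.each do |k|
    value  = rand(10_000)
    total += value
    uri = URI("#{BASE}/#{k}?w=3&dw=3")
    req = Net::HTTP::Put.new(uri.request_uri)
    req['Content-Type'] = 'text/plain'
    req.body = value.to_s
    Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
  end
  puts "Wrote #{KEYS.size} #{total}"

  # Re-read everything once a second with r=3 and report count + sum;
  # shut a node down partway through and watch whether the numbers wobble.
  loop do
    found = 0
    sum   = 0
    KEYS.each do |k|
      uri  = URI("#{BASE}/#{k}?r=3")
      resp = Net::HTTP.start(uri.host, uri.port) { |http| http.get(uri.request_uri) }
      next unless resp.code == '200'
      found += 1
      sum   += resp.body.to_i
    end
    puts "#{Time.now.to_i}: #{found} #{sum}"
    sleep 1
  end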