I was able to reproduce the issue, with some manual intervention, on the same single-node setup.
1. Using swift-get-nodes, I found the exact order of nodes in which Swift would attempt to write an object.
2. I manually unmounted the primary and first handoff disks.
3. I wrote the object using the swift command-line tool. It succeeded, and as expected the object landed on the second handoff node.
4. I remounted all the disks.
5. I used the swift command-line tool to issue a 'stat' on that object. It failed with a 404, and the logs showed that only the primary node and the first handoff node were accessed.
6. I bumped request_node_count up to 4 and restarted the proxy server.
7. The same 'stat' that failed in step 5 now succeeded, and the logs showed the object was read from the second handoff node.

(A rough sketch of the commands and the config change is at the bottom of this message.)

For now, we could live with setting request_node_count equal to the disk count. But it would be great to find out whether this can happen in non-single-replica setups too, and why request_node_count affects reads but not writes.

Regardless, ClayG, you've been remarkable!! Thanks a ton!

-Shri

On Tue, May 24, 2016 at 10:46 PM, Mark Kirkwood <mark.kirkw...@catalyst.net.nz> wrote:
> On 25/05/16 15:10, Clay Gerrard wrote:
>>
>> At the risk of repeating myself:
>>
>> On Tue, May 24, 2016 at 5:30 PM, Clay Gerrard <clay.gerr...@gmail.com
>> <mailto:clay.gerr...@gmail.com>> wrote:
>>
>>     This inconsistency in search depth based on the per-worker error
>>     limiting may be something worth looking into generally - but it's
>>     probably mostly hidden on clusters that are going 3 or more nodes
>>     deep into handoffs in the default case.
>>
>> I just don't know how much of an issue it is for clusters where each
>> device in the nodes iter is going to be on a separate node, probably in
>> independent failure domains - and you've got to not only whiff multiple
>> primaries but multiple handoffs too. Moreover, you'd have to have *all*
>> the replicas get past request_node_count, or else one of the earlier
>> replicas is going to service the read anyway without ever noticing the
>> write that landed further out. Or even if my theory about error limiting
>> maybe allowing a PUT to go deeper than request_node_count holds...
>>
>> Obviously the data got onto the second handoff device somehow - it'd be
>> interesting to see the transaction logs for the write that did that - but
>> there's a lot of ifs in there, and I'm not sure it's an issue except in
>> the single-replica, single-node case. Still sorta interesting I guess...
>>
>
> Yeah, single replica is pretty special (maybe useful for teasing out some
> unusual bugs to examine).
>
> I'll have another look at a slow system I have here (3 replicas, 3 nodes
> each with 6 devices or thereabouts) that *was* exhibiting 404s when trying
> to read just-created containers or objects. *If* it is not immediately
> misconfigured and if I can still reproduce the behaviour, I'll post (new
> thread to avoid confusion).
>
> regards
>
> Mark
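P.S. In case it helps anyone else trying to reproduce this, here is roughly what the steps above look like as commands. This is only a sketch from my single-node setup - the ring path, account, container/object names, device mount points and conf path are all specific to my environment and will differ on yours:

    # 1. find the primary and handoff devices for the object
    swift-get-nodes /etc/swift/object.ring.gz AUTH_test testcontainer testobj

    # 2. unmount the primary and first handoff devices it reported
    #    (sdb1/sdc1 are just examples)
    sudo umount /srv/node/sdb1
    sudo umount /srv/node/sdc1

    # 3. write the object (swift client auth env vars already exported)
    swift upload testcontainer testobj

    # 4. remount all the disks
    sudo mount -a

    # 5. this stat returned a 404 with the default request_node_count
    swift stat testcontainer testobj

    # 6. raise request_node_count in the [app:proxy-server] section of
    #    /etc/swift/proxy-server.conf, e.g.
    #        request_node_count = 4
    #    then restart the proxy
    sudo swift-init proxy-server restart

    # 7. the same stat now succeeds, served from the second handoff node
    swift stat testcontainer testobj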