On 25/05/16 15:10, Clay Gerrard wrote:
At the risk of repeating myself:

On Tue, May 24, 2016 at 5:30 PM, Clay Gerrard <clay.gerr...@gmail.com> wrote:


    This inconsistency in search depth based on the per-worker error
    limiting may be something worth looking into generally - but it's
    probably mostly hidden on clusters that are going 3 or more nodes
    deep into handoffs in the default case.


I just don't know how much of an issue it is for clusters where each device in the nodes iter is on a separate node, probably in independent failure domains - you'd have to not only whiff multiple primaries but multiple handoffs too. Moreover, *all* of the replicas would have to land past request_node_count, or one of the earlier replicas is going to service the read anyway without ever noticing the write that landed further out. And that's assuming my theory about error limiting allowing a PUT to go deeper than request_node_count even holds...
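
(To make the failure mode concrete, here's a rough sketch of the node iteration being discussed - primaries, then handoffs, skipping error-limited devices, capped at request_node_count. get_part_nodes/get_more_nodes are real Ring methods, but is_error_limited is just a stand-in for the proxy's per-worker error limiting - this is a simplification, not Swift's actual code:

    from itertools import chain, islice

    def iter_nodes(ring, part, request_node_count, is_error_limited):
        # Primaries first, then the (effectively unbounded) handoff chain.
        candidates = chain(ring.get_part_nodes(part),
                           ring.get_more_nodes(part))
        # Per-worker error limiting: each proxy worker skips a different
        # set of nodes, so two workers can walk different sequences.
        usable = (node for node in candidates if not is_error_limited(node))
        # Stop after request_node_count nodes - a write that landed on a
        # node past this cut-off is invisible to a read that stops here.
        return islice(usable, request_node_count)

If the PUT worker's error limiting pushed the write past the cut-off that the GET worker's iteration stops at, you'd get exactly the wrote-it-but-can't-read-it window.)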

Obviously the data got onto the second handoff device somehow - it'd be interesting to see the transaction logs for the write that did that - but there are a lot of ifs in there, and I'm not sure it's an issue except in the single-replica, single-node case. Still sorta interesting I guess...



Yeah, single replica is a pretty special case (though maybe useful for teasing out some unusual bugs to examine).

I'll have another look at a slow system I have here (3 replicas, 3 nodes each with 6 devices or thereabouts) that *was* exhibiting 404s when trying to read just-created containers or objects. *If* it turns out not to be simply misconfigured, and if I can still reproduce the behaviour, I'll post (in a new thread, to avoid confusion).
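
For the record, the read-after-create check I have in mind is essentially this (a sketch using python-swiftclient - the auth URL and credentials are placeholders for whatever your cluster uses):

    # Sketch of a read-after-create probe using python-swiftclient.
    # The auth URL and credentials below are placeholders.
    import time
    from swiftclient.client import Connection, ClientException

    conn = Connection(authurl='http://proxy:8080/auth/v1.0',
                      user='test:tester', key='testing')

    misses = 0
    for i in range(100):
        name = 'repro-%d-%d' % (int(time.time()), i)
        conn.put_container(name)        # create
        try:
            conn.head_container(name)   # immediately read back
        except ClientException as e:
            if e.http_status == 404:
                misses += 1             # the behaviour in question
            else:
                raise
    print('404s on freshly created containers: %d/100' % misses)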

regards

Mark
